Abstract


Despite being one of the oldest data structures in computer science, hash tables continue to be the 
focus of a great deal of both theoretical and empirical research. A central reason for this is that many of 
the fundamental properties that one desires from a hash table are difficult to achieve simultaneously; thus 
many variants offering different trade-offs have been proposed. 

This paper introduces Iceberg hashing, a hash table that simultaneously offers the strongest known 
guarantees on a large number of core properties. Iceberg hashing supports constant-time operations while 
improving on the state of the art for space efficiency, cache efficiency, and low failure probability. Iceberg 
hashing is also the first hash table to support a load factor of up to 1 - o(1) while being stable, meaning
that the position where an element is stored only ever changes when resizes occur. In fact, in the setting
where keys are O(log n) bits, the space guarantee that Iceberg hashing offers, namely that it uses at most
$\log \binom{|U|}{n} + O(n \log\log n)$ bits to store n items from a universe U, matches a lower bound by Demaine et
al. that applies to any stable hash table.

Iceberg hashing introduces new general-purpose techniques for some of the most basic aspects of 
hash-table design. Notably, our indirection-free technique for dynamic resizing, which we call waterfall 
addressing, and our techniques for achieving stability and very-high probability guarantees, can be applied 
to any hash table that makes use of the front-yard/backyard paradigm for hash table design. 
1 Introduction


The hash table is one of the oldest and most fundamental data structures in computer science. Hash tables 
were invented by Hans Peter Luhn in 1953 during the development of IBM’s first commercial scientific 
computer, the IBM 701 [29]. Luhn’s implementation used what is now known as chained hashing.¹ Since
items are addressed via pointers, chained hash tables waste space and offer poor data locality.

In the nearly seven decades since, there has been a huge literature on hashing; some important milestones 
can be summarized in the following progression of work. Linear probing, which was introduced in 1954 [29, 43],
achieves good data locality and constant-time operations in expectation, but scales poorly to high load
factors. In the 1980s, Fredman, Komlós, and Szemerédi [19] showed how to achieve worst-case constant
time queries, and subsequent work [13, 14] showed how to dynamize this hash table, but at the cost of poor
space efficiency and data locality. In the early 2000s, Cuckoo hashing [8, 15, 18, 40] was introduced, providing
constant-time queries and updates with better space efficiency. Finally, in the past decade, several hash tables 
have been developed that offer a variety of even stronger performance guarantees, including 
very-high probability constant-time operations, very high load factors, etc. 

One of the great ironies in the study of hashing is that, even after seven decades of research and many 
proposed alternatives, chained hashing remains one of the most widely used hash table designs, even serving 
as the default in performance-oriented languages such as C++ [00I]. Since chaining is missing many of the 
desirable properties of other hash tables (space efficiency, data locality, constant-time operations, etc.), why 
is it that it continues to be so widely used? 

The reason is simple: chaining offers one guarantee, referential stability, that is not offered by most
other hash table designs. Referential stability requires that elements not change location in the table, except
for when table resizes are performed [OOII]. This is important in many settings: to reduce locking 
and increase concurrency; to allow pointers into the table; to support iterators through the hash table, etc. 

What makes stability algorithmically interesting is that the known techniques for achieving it are 
fundamentally at odds with the other desirable guarantees. Stability itself is easily achieved by storing 
pointers to elements in the hash table, rather than the elements themselves. But as in chaining, these pointers 
compromise other central guarantees, such as space efficiency and data locality. 

In fact, stability illustrates just one example of a more general phenomenon—that known techniques for 
achieving many central hash-table guarantees preclude others. Even in cases where we know how to achieve 
individual guarantees, the question of whether we can get these guarantees together in the same hash table is 
often much harder. Some of the most substantial breakthroughs in the field have been needed to achieve even 
basic combinations, e.g., high load factor and dynamic resizing [45], high load factor and constant-time oper- 
ations [4], or very recently, dynamically-resizable high load factor and constant-time operations [34]. And, as 
we shall discuss in more detail later, some other basic combinations are still well beyond the known techniques. 

Modern work on hashing focuses on the following core list of desirable guarantees: 


Time:
• constant-time operations: insertions/queries/deletions take O(1) time w.h.p.
• (1 + o(1)) cache optimality: operations incur 1 + o(1) cache misses in the external memory model.


Space:
• load factors of 1 - o(1): all but an o(1) fraction of space is used to store elements.
• dynamic resizing: the table dynamically adjusts its space consumption to match the current size.


Functionality:
• very-high probability guarantees: the guarantees have subpolynomial failure probability.
• referential stability: the only way that elements move around is when the table is resized.


Each property individually has its own (sometimes extensive) line of research, and the question of whether 
optimal guarantees for all of the properties can be achieved together has remained a significant open problem. 


This paper: Iceberg hashing. This paper introduces the Iceberg hash table. Iceberg hashing matches 
the states of the art for all of the above properties simultaneously, and also improves the states of the art for 
space efficiency and failure probability. 


1 As Knuth points out, the implementation may also have been the first use of linked lists in computer science.


Iceberg hashing introduces new techniques for some of the most basic aspects of hash-table design. 
Notably, our indirection-free technique for dynamic resizing, which we call waterfall addressing, and our 
techniques for achieving stability and very-high probability guarantees, can be applied to any hash table that 
makes use of the backyarding paradigm for hash table design. 

Iceberg hashing also revisits one of the oldest approaches for designing space-efficient hash tables: back- 
yarding. Introduced in 1957 [43], the basic idea is that records are first hashed into bins in the front yard and 
if the target bin is full, the record is instead stored in a small backyard hash table. As long as the backyard is 
small, consisting of o(n) elements, we can afford to store it in a less space-efficient manner. In recent work,
backyarding has been used to achieve high space efficiency in constant-time hash tables [4, 6, 21, 22]. Our techniques
allow for this space efficiency to be preserved, while also achieving the other core guarantees described above. 


1.1 The guarantees of an Iceberg hash table 


Referential stability. A hash table is said to be stable if whenever a new element x is inserted, the position
in which x (along with any value associated with x) is stored is guaranteed not to change until either x is
deleted or the table is resized.²

Empirical work on the problem of designing space-efficient, stable hash tables dates back to the early 
1980s [E2347] (see also Knuth’s Volume 3 [29]). Much of the theoretical work on stability has focused 
on a weaker version of the property called value stability: values associated with keys are stable, but the 
keys need not be stored with those values and are allowed to move.³ Demaine et al. give a general-purpose
approach (Theorem 3 of [12]) for space-efficiently achieving value stability in any hash table, by adding an
extra layer of indirection that can be encoded with just O(log log n) extra bits per key. Of course, such a layer 
of indirection is incompatible with data-locality, so if we want to achieve value stability (and, more generally, 
full stability) in a hash table that is also cache friendly, then an alternative approach must be taken. 

Besides the approach of using indirection [10, 11], a second common approach to achieving stability
has been to consider open addressing schemes (such as linear probing) with deletions implemented using 
tombstones; in particular, this means that when an element is deleted, it is simply removed from the table, 
and no other elements are moved around. Despite both empirical work and theoretical work on 
analyzing such schemes, the complex dependencies between insertions and deletions over time have prevented 
any analysis from offering provable guarantees at high load factors (see discussion in JE! 
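
To make the tombstone approach concrete, the following is a minimal Python sketch of linear probing with tombstone deletions. It is an illustration under simplifying assumptions (fixed capacity, no resizing, Python's built-in hash as the hash function), and the class and method names are hypothetical rather than taken from any of the cited works. A deletion only overwrites the deleted key's own slot, so every other key keeps its position.

```python
class TombstoneTable:
    """Linear probing with tombstone deletions (fixed capacity, no resizing)."""

    _EMPTY = object()      # slot that has never held a key
    _TOMBSTONE = object()  # slot whose key was deleted

    def __init__(self, capacity):
        self.slots = [self._EMPTY] * capacity

    def _probe(self, key):
        # Visit slots in linear-probing order, starting at the hashed position.
        start = hash(key) % len(self.slots)
        for step in range(len(self.slots)):
            yield (start + step) % len(self.slots)

    def find(self, key):
        """Return the slot index holding key, or None if key is absent."""
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is self._EMPTY:
                return None  # a never-used slot ends the probe sequence
            if slot is not self._TOMBSTONE and slot == key:
                return i
        return None

    def insert(self, key):
        """Insert key if absent and return its slot; tombstones may be reused."""
        found = self.find(key)
        if found is not None:
            return found
        for i in self._probe(key):
            if self.slots[i] is self._EMPTY or self.slots[i] is self._TOMBSTONE:
                self.slots[i] = key
                return i
        raise RuntimeError("table is full")

    def delete(self, key):
        """Replace key by a tombstone; no other key is moved."""
        i = self.find(key)
        if i is not None:
            self.slots[i] = self._TOMBSTONE
```

For example, inserting the keys 1, 9, and 17 into a table of capacity 8 places them in consecutive slots; deleting 9 leaves 1 and 17 where they were, and a later insertion may reuse the freed slot.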


Our technique for stability: an unmanaged backyard. A trademark of the use of backyards in recent 
work BACACARN2284] has been the design of creative ways to move elements from the backyard to the 
front yard whenever space frees up in the latter (for example, Arbitman et al. store the backyard as a 
deamortized cuckoo hash table, and whenever a cuckoo eviction is performed, they check whether the element 
can instead be moved back to the front yard). 

Of course, another approach would be to simply leave the backyard unmanaged, allowing for elements to 
remain in the backyard even when space frees up in the front yard. We prove a general-purpose result that we
call the Iceberg Lemma, which establishes that backyards do not, in fact, require any maintenance to stay
small.⁵ An essential ingredient of the Iceberg Lemma is that it bounds the size of the backyard not just with
high probability, but also with super-high probability (in fact, probability $1 - 1/2^{n/\mathrm{polylog}\, n}$). This ends up
being central to our data-structure design, as it allows for stability and super-high probability guarantees to 
be achieved simultaneously, without being at odds with one another. 
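
As a rough illustration of what an unmanaged backyard means operationally, the following Python sketch stores overflow elements in a backyard and never moves them back. It is a simplified model and not the data structure developed in this paper: the class name, the use of Python dictionaries for bins and for the backyard, and the fixed bin capacity are all assumptions made for the example.

```python
class UnmanagedBackyardTable:
    """Front yard of bounded bins plus a backyard that is never rebalanced."""

    def __init__(self, num_bins, bin_capacity):
        self.bin_capacity = bin_capacity
        self.front = [dict() for _ in range(num_bins)]  # front-yard bins
        self.backyard = {}  # overflow elements; a plain dict stands in here

    def _bin(self, key):
        return self.front[hash(key) % len(self.front)]

    def insert(self, key, value):
        b = self._bin(key)
        if key in b:
            b[key] = value               # update in place
        elif key in self.backyard:
            self.backyard[key] = value   # update in place; the key is not promoted
        elif len(b) < self.bin_capacity:
            b[key] = value               # room in the target bin
        else:
            self.backyard[key] = value   # bin full: overflow to the backyard

    def lookup(self, key):
        b = self._bin(key)
        return b[key] if key in b else self.backyard.get(key)

    def delete(self, key):
        b = self._bin(key)
        if key in b:
            del b[key]                   # frees space, but nothing moves in
        else:
            self.backyard.pop(key, None)

    def location(self, key):
        """Return 'front', 'backyard', or None; used only to observe stability."""
        if key in self._bin(key):
            return "front"
        return "backyard" if key in self.backyard else None
```

In this sketch an element that overflows stays in the backyard even after deletions free space in its bin, which is exactly the behavior whose cost the Iceberg Lemma bounds.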

The approach of having an unmanaged backyard is analogous to the use of tombstones in open addressing. 
In both cases, one takes a data structure in which one would normally move elements around and one simply 


2 In addition to being required for any implementation of the C++ unordered map, stability is an integral part of the design
for the standard hash tables used at both Google and Facebook [17]. Stable hash tables typically offer a Reserve function,
which allows users to guarantee that the table will remain stable until the next time that it exceeds some reserved capacity, which 
is why stability is typically not required during resizing. 

3 Value stability is sufficient for some applications of stability (e.g., storing pointers to values, so that the values can be directly
edited) but not others (e.g., supporting iterators, storing pointers into the hash table that can be used to verify that a given 
key/value is present; designing concurrent hash tables that rely on elements staying put, etc.). 

4 And even if such guarantees were possible, the performance degradation that these schemes incur at high load factors 
would still appear to be problematic for proving time bounds on unsuccessful searches. 

5 The name of the lemma stems from the fact that the majority of an iceberg remains naturally underwater, while only a small
portion protrudes above. 


analyzes what happens if instead elements are always left in place. The result is that there are intricate 
circular dependencies between where elements reside over time, depending on the history of past insertions, 
deletions, and re-insertions. To overcome these dependencies and achieve super-high probability guarantees, 
our proof of the Iceberg Lemma makes use of a number of interesting combinatorial ideas. 


Using only O(log log n) extra bits per key. The first hash table to achieve constant-time operations
with a load factor of 1 - o(1) was that of Arbitman et al. [4]. They achieve a load factor of $1 - \varepsilon$, where
$\varepsilon = O(\sqrt{\log\log n}/\sqrt{\log n})$. The same paper poses as an open question whether a smaller $\varepsilon$ is achievable.
Recently, Liu et al. [34] presented the first progress on this problem, shaving a $\sqrt{\log\log n}$ factor and achieving
$\varepsilon = O(1/\sqrt{\log n})$.⁶

Iceberg hashing further improves $\varepsilon$ to O(log log n / log n). This is an especially big improvement in the
common case where keys consist of O(log n) bits. Here, Iceberg hashing uses only O(log log n) extra bits per
key in comparison to the previous state of the art of $O(\sqrt{\log n})$ extra bits per key [34].

For O(log n)-bit keys, we also show how to implement Iceberg hashing as a succinct data structure, using 
only O(log log n) extra bits per key when compared to the information-theoretic optimum. In achieving
this space bound, our hash table is the first dynamic dictionary to match the lower bound of Demaine et 
al. (Theorem 2 of [12]) on the number of bits required by any hash table that stores elements by assigning them 
stable positions in an array. Thus our hash table has provably optimal space usage across all such hash tables. 

Interestingly, in addition to enabling a stable backyard, the Iceberg Lemma ends up independently playing 
an important role in our high-space-efficiency results. In particular, it allows for the use of backyarding as a 
way to store metadata succinctly. 


In-place dynamic resizing. A hash table supports dynamic resizing if the space consumption is a function 
of the current number of records n, rather than some upper bound N on the number of records that could ever 
be in the data structure. 

Arbitman, Naor, and Segev [4] pose the open question of how to maintain a constant-time, space-efficient
hash table that supports dynamic resizing. Recently, Liu, Yin, and Yu [34] gave an elegant solution to this 
problem, in which records are stored in bins and each bin is represented space-efficiently with fine-grained 
memory allocations, where the bin is incrementally expanded/contracted by allocating/deallocating small 
chunks of memory. The resulting layer of indirection is incompatible with 1 + o(1) cache optimality. 


Our technique for indirection-free resizing: waterfall addressing. Waterfall addressing revisits the 
most natural approach to maintaining a space-efficient hash table, which is to simply incrementally resize the 
table by 1 + o(1) factors so that it always stays at a high load factor. The problem with this approach, and 
the reason that it has not been used in past work, is that each resize naively requires Ω(n) work to rebuild the
table, making the approach time inefficient. 

Waterfall addressing maps elements to bins in a way that offers the following guarantees. Whenever the 
table size increases by a 1 + o(1) factor, only a o(1) fraction of elements have their bin changed, and in fact, 
the only elements whose bins change are the ones that move into the newly created portion of the hash table.
Moreover, waterfall addressing allows a time-efficient way to identify which elements need to be moved, so a 
resize can be performed in time proportional to the amount by which the table size is changing. Finally, the 
probability that any element lands in any bin is nearly uniform, both before and after resizing. 
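
The first guarantee (a resize only moves elements into the newly created bins) can be illustrated with a much simpler stand-in. The Python sketch below is not waterfall addressing, whose construction is given later in the paper; it is a naive scheme in the style of jump consistent hashing, and the function names and the use of SHA-256 as a source of pseudo-randomness are assumptions made for the example. Unlike waterfall addressing, it takes time linear in the number of bins to evaluate and offers no efficient way to find the elements that must move.

```python
import hashlib


def _coin(key, i):
    """Deterministic pseudo-random integer in [0, i] derived from (key, i)."""
    digest = hashlib.sha256(f"{key}:{i}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % (i + 1)


def bin_of(key, num_bins):
    """Bin of key in a table with num_bins bins (naive O(num_bins) evaluation)."""
    b = 0
    for i in range(1, num_bins):
        # When the table grows from i to i + 1 bins, the key jumps to the new
        # bin i with probability about 1 / (i + 1); otherwise it stays put.
        if _coin(key, i) == 0:
            b = i
    return b


def movers(keys, old_bins, new_bins):
    """Keys whose bin changes when the table grows from old_bins to new_bins."""
    return [k for k in keys if bin_of(k, old_bins) != bin_of(k, new_bins)]
```

By construction, growing the table from 16 to 20 bins changes the bin of a key only if its new bin is one of the bins 16 through 19, and each key lands in each bin with nearly equal probability at every table size.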


(1 + o(1)) cache optimality. Whereas the standard RAM model evaluates the running time of an algorithm
in terms of the number of operations performed, the External Memory (EM) model [2] measures performance
in cache misses (sometimes called block transfers or I/Os). The EM model has two parameters, the size M of 
the cache and the size B of a cache line (both measured in machine words). 

Any constant-time hash table trivially incurs O(1) cache misses per operation. Jensen and Pagh 
showed that a much stronger guarantee is possible: there is a constant c such that if M > cB, one can 
implement a hash table having load factor $1 - O(1/\sqrt{B})$ and supporting each operation with $1 + O(1/\sqrt{B})$
expected cache misses (at the cost of some extra computation). 

In the case where $B \le \log^2 n / \log\log n$, Iceberg hashing achieves nearly as strong a guarantee on cache
misses, while also reducing the computational cost to O(1). Specifically, using a cache of size M > polylog n,


6 Although [34] considers only insertions (and no deletions), the same basic approach can be made to work with deletions,
using the allocate-free version of their techniques (see Section 7 of [34]). 


an Iceberg hash table achieves a load factor of $1 - O(\sqrt{\log B}/\sqrt{B})$ with $1 + O(1/\sqrt{B})$ expected cache misses
per operation.⁷


Very-high probability constant-time guarantees. Goodrich et al. consider the problem of 
achieving subpolynomial probabilities of failure in a constant-time hash table. They note that modern hash 
tables have two sources of failure: failures due to hash functions being not sufficiently random; and failure due 
to the design of the table itself. Failures of the former type stem from the fact that the best known families 
of hash functions [16, 38, 48] make use of expander graphs, the deterministic construction of which remains
one of the longest-standing open problems in extremal combinatorics. As noted by Goodrich et al. [21, 22],
however, it is nonetheless possible to isolate out failures of the second type by simply assuming access to a
fully random hash function. Under this assumption, the authors construct the first hash table to have
a subpolynomial failure probability, specifically achieving a $1/2^{\mathrm{polylog}\, n}$ probability of failure with a load factor
of $1 - \varepsilon$ for an arbitrarily small constant $\varepsilon > 0$.

Iceberg hashing matches this probability guarantee with an interesting twist: if there exists a hash table 
with a lower failure probability p and that supports a constant load factor, then Iceberg hashing can be 
automatically improved to have failure probability $O(p) + 2^{-n/\mathrm{polylog}\, n}$ (and without compromising any of
the other guarantees on space efficiency, cache efficiency, dynamic resizing, and stability). 

The smallest achievable value of p remains an open question. The hash table of [22] achieves $p = 1/2^{\mathrm{polylog}\, n}$,
which is the state of the art. We show that, in the common case where keys are O(log n) bits, a substantially 
smaller failure probability of p is achievable. In particular, we give a simple data structure that achieves failure 
probability $p = 1/2^{n^{1-\varepsilon}}$ (for a positive constant $\varepsilon$ of our choice). This, in turn, implies that the same failure
probability can be achieved for Iceberg hashing in this case. 

As we shall discuss later, all of the properties of Iceberg hashing besides very-high-probability guarantees 
can be implemented using known families of hash functions (assuming the description bits of the hash function 
are cached). The known results on very-high-probability guarantees (including ours) all require access to fully 
random hash functions (or other families of hash functions that are not yet known to exist). Removing this 
requirement remains an interesting direction for future work. 


1.2 Paper Outline


The rest of the paper proceeds as follows. 


• Section 2 proves the Iceberg Lemma.


• Section 3 presents a basic version of the Iceberg hash table that is space efficient, cache efficient, and stable.
Subsequent sections then build on this basic data structure to achieve further guarantees.


• Section 4 shows how to perform fine-grained dynamic resizing on an Iceberg hash table, using waterfall
addressing. We show how to achieve space-efficient dynamic resizing without compromising other properties,
such as cache performance.


• Section 5 extends the parameter range in which Iceberg hashing can be implemented in order to allow for
further improvements to space efficiency. This results in a load factor of 1 - O(log log n / log n).


• Section 6 considers the problem of achieving subpolynomial failure guarantees assuming fully random hash
functions. We show how to implement Iceberg hashing for O(log n)-bit keys in a way that achieves failure
probability $1/2^{n^{1-\varepsilon}}$.

• Section 7 uses quotienting to make Iceberg hashing into a fully succinct data structure, meaning that the
space consumption is (1 + o(1)) times the theoretical optimum. As in past work [4, 34], we focus on the case
where keys are O(log n) bits. This results in the first succinct dynamic hash table to support constant-time
operations (with high probability) and waste only O(log log n) bits of space per key when compared to the
information-theoretic optimum.


• Section 8 presents an overview of related work.


7 The upper bound on B is necessary given that achieving the same space efficiency for larger B would require further
improvement on the state of the art for hash-table space efficiency in the RAM model.


• Finally, Appendix A gives explicit families of hash functions that can be used to implement Iceberg hashing.
As is the case for past hash tables, this introduces a 1/poly n probability of failure. It remains an open
question whether smaller failure probabilities can be achieved with constant-time hash-table operations. 
(For some partial progress on this question see [22].) 


2 Iceberg Lemma 


As discussed in the introduction, a common technique for implementing space-efficient hash tables is to use 
a front yard data structure for most elements and a backyard data structure on a small subset of overflow 
elements. Since the backyard is so small, the data structure used to implement it need not be as space efficient. 

This section considers the question of what happens if the backyard is unmanaged, meaning that once 
an element is placed into the backyard, it is not moved back to the front yard, even if space frees up in the 
appropriate bin. 


The Iceberg Game. We capture the problem formally with what we call the ICEBERG GAME, which ignores 
the structure of the backyard but allows us to bound its size. The ICEBERG GAME considers n bins and 
a universe U of balls. A sequence of ball insertions and removals is performed over time, with the only
constraints being that there are never more than m = hn balls in the system at any given moment, and that
all balls in the system are distinct. Whenever a ball is inserted, it is hashed to a random bin (if the same ball is
inserted, deleted, and later reinserted, the same bin assignment is used). If the bin being inserted into contains
more than $h + \tau_h$ balls, where $\tau_h$ is a parameter we will set later, then the ball is labeled as exposed.⁸ Note that
the number of exposed balls at any time is an upper bound for the number of balls that would be in a backyard. 

The ICEBERG GAME takes an intentionally liberal approach to labeling balls as exposed. In particular, it 
would be natural to consider exposed balls as residing in a backyard, and thus not counting towards the fills of 
the bins in the front yard. On the other hand, in the ICEBERG GAME, we intentionally count all balls towards 
the fills of bins. As we shall discuss in more detail later, this ensures the following useful property: whether 
a given ball is exposed or not depends only on the set of balls present during the insertion, rather than on the 
entire history of the system. 

The threshold $\tau_h$ needs to be chosen so that it is small enough to make the resulting hash table space
efficient and large enough to make the backyard small. If h ≤ polylog m, which is the relevant parameter
regime for hash tables, then we will show that


$$\tau_h = k \cdot (h \log h)^{1/2},$$


for a large enough constant k, is a good choice. For the sake of cleaner calculations, we also define an additional
parameter $c = k^2/3$, which we will use in the analysis.
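
The game is easy to simulate, which can help build intuition for how the threshold controls the number of exposed balls. The Python sketch below is an illustration only: it assumes the threshold has the form k·(h log h)^(1/2) with h ≥ 2, it uses one particular oblivious adversary (fill the system, then repeatedly delete a random ball and insert a random absent ball from a universe of 2m balls), and the function name and parameters are hypothetical.

```python
import math
import random


def iceberg_game(n, h, k, steps, seed=0):
    """Simulate the game on n bins with m = h * n balls; return the exposed count."""
    bins_rng = random.Random(seed)     # randomness of the bin assignments
    adv_rng = random.Random(seed + 1)  # oblivious adversary: never inspects bins
    tau = k * math.sqrt(h * math.log(h))  # assumed threshold k * (h log h)^(1/2)
    m = h * n
    universe = 2 * m
    bin_of = {}     # ball -> bin; reused if the ball is reinserted
    fill = [0] * n  # every present ball counts towards the fill of its bin
    exposed = {}    # present ball -> label assigned at its latest insertion
    present = []    # present balls, kept in a list for O(1) random deletion

    def insert(ball):
        if ball not in bin_of:
            bin_of[ball] = bins_rng.randrange(n)
        b = bin_of[ball]
        exposed[ball] = fill[b] > h + tau  # label depends only on the current fill
        fill[b] += 1
        present.append(ball)

    def delete_random():
        i = adv_rng.randrange(len(present))
        present[i], present[-1] = present[-1], present[i]
        ball = present.pop()
        fill[bin_of[ball]] -= 1
        del exposed[ball]

    for ball in range(m):  # fill the system up to m balls
        insert(ball)
    for _ in range(steps):  # then churn: one deletion, one insertion
        delete_random()
        ball = adv_rng.randrange(universe)
        while ball in exposed:  # pick a ball that is not currently present
            ball = adv_rng.randrange(universe)
        insert(ball)
    return sum(exposed.values())
```

Running the simulation for a few values of k shows the trade-off described above: a small k leaves many balls exposed, while a larger k drives the exposed count towards zero at the price of a higher threshold.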


Bounding the number of exposed balls. An adversary that wishes to force a state where almost all balls 
were exposed should seek to delete non-exposed balls and insert new balls (or reinsert the old ones), hoping 
that the new ones become exposed. The following lemma shows that it is almost impossible for an oblivious 
adversary to achieve this goal. We use the convention that an event happens with super-high probability
(w.s.h.p.) in n if it happens with probability at least $1 - 2^{-n/\mathrm{polylog}\, n}$.


Lemma 1 (Iceberg Lemma). As long as h ≤ polylog m, then at every point in time the number of exposed
balls is at most n/poly h w.s.h.p. in m.


In this section we assume h ≤ polylog m. This means that w.s.h.p. in n is equivalent to w.s.h.p.
in m, so from now on we will simply say w.s.h.p. without specifying the variable. For the same reason,
n/polylog n = m/polylog m everywhere.

We remark that if h > polylog m, and we were to set $\tau_h = h^{1/2+\varepsilon}$ for $\varepsilon > 0$, then a result analogous to the
Iceberg Lemma (but only w.h.p. rather than w.s.h.p.) would be immediate because standard Chernoff bounds
would show that there are no exposed balls, w.h.p. What makes the h ≤ polylog m case interesting is that
there are (almost certainly) going to be exposed balls, but we want to show that there will not be too many.


8 One should think of the balls in the ICEBERG GAME as stacked up inside the bins in the order of arrival, forming an “iceberg”:
balls at height at most $h + \tau_h$ are below the sea level, and all other balls are afloat and exposed. Since balls can be deleted,
some exposed balls may sink below sea level, but they will retain the exposed label. In other words, this label doesn’t refer to the 
current location of a ball in the iceberg layout, but rather to whether it was exposed upon insertion. 


Proving the Iceberg Lemma 


A simple argument to achieve a w.h.p. bound. We begin by observing that there is a very simple
argument that can be used to prove a w.h.p. version (rather than a w.s.h.p. version) of the Iceberg Lemma for
any sequence of poly n ball insertions/removals. The argument goes in three steps. First, we partition the bins
into groups of size $n^{\varepsilon}$, and argue that, w.h.p., each group always has at most $n^{\varepsilon} + n^{2\varepsilon/3}$ balls that hash to it at
a time; the rest of the argument conditions on a fixed outcome of which group each ball hashes to. Second, we
use linearity of expectation to bound the expected number of exposed balls for each group of bins. Finally, we
use the fact that, once we have conditioned on which balls hash to which groups, the numbers $r_1, r_2, \ldots, r_{n^{1-\varepsilon}}$
of exposed balls that are in each of the $n^{1-\varepsilon}$ groups are independent random variables with values in the range
$[0, O(n^{\varepsilon})]$; applying Hoeffding’s inequality, we can conclude that the total number $X = \sum_i r_i$ of exposed balls
is tightly concentrated around its mean, w.h.p.

This basic technique of breaking the bins into groups, conditioning on how many balls hash to each group, 
and analyzing the groups independently, is a classic approach for handling dependencies in balls-and-bins 
games (and has been used, for example, to handle limited independence in hash tables that require high inde- 
pendence and to construct quotient-friendly families of permutation hash functions [12]). The limitation 
of the technique, however, is that it achieves much weaker probability bounds than the super-high-probability 
bounds that we want for the Iceberg Lemma. 

In order to achieve tight probabilistic bounds, we will need to take a more sophisticated approach that 
analyzes all of the balls/bins together, and carefully handles the subtle interdependencies between operations 
in the operation sequence. By allowing for an unmanaged backyard, while also establishing super-high 
probability bounds, the Iceberg Lemma will allow for us to achieve both stability and super-high-probability 
guarantees in the Iceberg hash table. It turns out that the strong probability bounds offered by the Iceberg 
Lemma also enable applications of the lemma to other areas in data structures—we outline a number of such 
applications in a subsequent paper [5]. 


Notation. Before we discuss the w.s.h.p. analysis, let us take a moment to define some notation. Let t be
any fixed time step. Let $A = \{a_1, \ldots, a_{m'}\}$ be the set of balls present in the system at time t. Let $t_i$ be the time
at which $a_i$ was most recently inserted, and let $T = \{t_1, \ldots, t_{m'}\}$. Let B be the set of balls other than those in A
that are present at any $t_i \in T$, and denote the balls in B by $b_1, b_2, \ldots, b_{|B|}$. Observe that $m' \le m$, since there
are at most m balls present at time t, and that $|B| \le O(m^2)$, since there are at most m balls present
during each of the times $t_1, t_2, \ldots, t_{m'}$.

Let $\alpha = (\alpha_1, \ldots, \alpha_{m'})$ be the bin choices of the $a_i$'s, and let $\beta = (\beta_1, \ldots, \beta_{|B|})$ be the bin choices of the
$b_i$'s. Finally, let $X_i$ be the indicator variable that is 1 exactly if ball $a_i$ is exposed at its insertion time $t_i$. Then
$X = \sum_i X_i$ is the number of exposed balls at time t. We want to prove that $X \le n/\mathrm{poly}\, h$ w.s.h.p.


The difficulty of performing a tight probabilistic analysis on X. Roughly speaking, the main challenge
in the analysis stems from the fact that each ball $a_i$ may be present during an arbitrary subset of past time
steps in T, since balls can be deleted and reinserted. This means that the state of the system at steps before
$t_i$ may already depend on the randomness of $a_i$.

Since $X_i$ depends on the balls present at time $t_i$, it follows that $X_i$ depends on the bin choice of $a_j$ for every
j such that $t_j < t_i$, and thus $X_i$ depends on $X_j$. On the other hand, because $a_i$ may have been present at time
$t_j$ (before being removed and subsequently reinserted at $t_i$), $X_j$ may also depend on $X_i$. In particular, this
latter type of dependency implies that we cannot treat $a_i$ as choosing a bin uniformly and independently at
random at time $t_i$.

Given that the $X_i$'s are not independent, it is natural to hope that they might nonetheless be stochastically
dominated by a sum of independent 0-1 random variables $Y_1, \ldots, Y_{m'}$. In particular, one can show that,
w.s.h.p., at every time step $t_1, \ldots, t_{m'}$ there is at most a small fraction p of bins that have load above $h + \tau_h$.
This suggests that, perhaps, the $X_i$'s should be stochastically dominated by independent $Y_i$'s each with mean p.

Perhaps surprisingly, this stochastic-dominance approach does not work (even w.h.p.), as one can see with
the following example, which highlights some of the subtle dependencies between $X_i$'s. Consider the basic setting
in which m = 2 and balls are labeled as exposed if they land on top of another ball (note that, in this case, we
have p = 1/n). The adversary performs the following sequence of operations on two balls, a and b: (1) insert a,
(2) insert b, (3) delete a, (4) insert a. Let $X_1$ indicate whether b is exposed at step 2, and let $X_2$ indicate whether
a is exposed at step 4. Both $X_1$ and $X_2$ are 1 exactly when a and b choose the same bin, and thus $X_1 = X_2$
deterministically. Since $\Pr[X_1 + X_2 = 2] = p$, whereas two independent variables with mean p would both equal 1
with probability only $p^2$, the random variables are not dominated by independent random
variables $Y_1, Y_2$ with mean p. This example can be extended to an arbitrary m and threshold $h + \tau_h$, by adding
m - 2 redundant balls before step 1, and then replacing them with another m - 2 balls before step 3; note that
in general, this does not result in $X_1 = X_2$, but instead in a subtle positive dependence between $X_1$ and $X_2$.⁹
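
The two-ball example is small enough to check exhaustively. The following Python sketch enumerates all equally likely pairs of bin choices for a and b and computes the exact probability that both insertions are labeled exposed; the function name is hypothetical, and the code covers only the m = 2 setting in which a ball is exposed when it lands on top of another ball.

```python
from fractions import Fraction
from itertools import product


def joint_exposure_probability(n):
    """Exact Pr[X1 + X2 = 2] for: insert a, insert b, delete a, insert a."""
    both = 0
    for bin_a, bin_b in product(range(n), repeat=2):  # all bin choices
        fill = [0] * n
        fill[bin_a] += 1      # (1) insert a
        x1 = fill[bin_b] > 0  # (2) insert b: exposed iff it lands on a ball
        fill[bin_b] += 1
        fill[bin_a] -= 1      # (3) delete a
        x2 = fill[bin_a] > 0  # (4) reinsert a into the same bin as before
        fill[bin_a] += 1
        assert x1 == x2       # the two labels always coincide
        both += x1 and x2
    return Fraction(both, n * n)
```

For n bins the function returns 1/n, which is a factor of n larger than the probability 1/n² that two independent indicator variables of mean 1/n would both equal 1.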


Using McDiarmid’s Inequality. In order to prove the Iceberg Lemma, we first discuss a useful inequality:


Theorem 1 (McDiarmid’s inequality [36]). Let $X_1, \ldots, X_k$ be independent random variables taking values
from an arbitrary universe U. Let $F : U^k \to \mathbb{R}$. Suppose F satisfies the following Lipschitz condition: there
exists a real number $\ell$ (the Lipschitz bound), such that for all $i \in [k]$ and all $x_1, \ldots, x_k, \hat{x}_i \in U$,


$$|F(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_k) - F(x_1, \ldots, x_{i-1}, \hat{x}_i, x_{i+1}, \ldots, x_k)| \le \ell.$$


Let $X = F(X_1, \ldots, X_k)$. Then, for all b > 0,


$$\Pr[X \ge E[X] + b] \le \exp\left(-\frac{2b^2}{k\ell^2}\right).$$


In particular, if $\ell \le \mathrm{polylog}\, k$ then $X \le E[X] + k/\mathrm{polylog}\, k$ w.s.h.p. in k.


What happens if we try to apply McDiarmid’s inequality to the random variable $X$ as a function $F$ of the
$\alpha_i$'s and $\beta_i$'s? This comes with two issues: the first is that there are up to $O(m^2)$ different $\beta_i$'s, meaning that
$k$ is $O(m^2)$ (which is too large to be useful); and the second is that the Lipschitz condition ends up not being
satisfied (although, as we shall see, it is “close” to satisfied).
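To see why the number of variables matters, the following sketch evaluates the McDiarmid tail bound numerically. It assumes the standard form exp(-2b^2/(k l^2)) of the bound, a Lipschitz bound of 1, and a deviation of m / log^2 m; these concrete choices are illustrative assumptions, not values taken from the analysis.

```python
import math

def mcdiarmid_tail_bound(b, k, ell):
    """Standard McDiarmid tail bound exp(-2 b^2 / (k ell^2))."""
    return math.exp(-2.0 * b * b / (k * ell * ell))

m = 2 ** 30
deviation = m / math.log2(m) ** 2   # a deviation of the form m / polylog m

# With k = O(m) independent variables the bound is vanishingly small ...
bound_linear = mcdiarmid_tail_bound(deviation, k=m, ell=1)
# ... but with k = O(m^2) variables it is close to 1, i.e. it says nothing.
bound_quadratic = mcdiarmid_tail_bound(deviation, k=m * m, ell=1)
```

With k on the order of m^2 the exponent shrinks to roughly -2/log^4 m, so the bound is vacuous; this is the first of the two issues described above.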


A two-phased analysis. To enable the use of McDiarmid’s inequality, we will break the analysis into two
phases. In the first phase, we will consider an arbitrary $\beta$ and analyze the random variable $X \mid \beta$, that is, the
random variable $X$ in which we are using a predetermined $\beta$ (so the only remaining randomness is in $\alpha$). Since
$\alpha$ has dimension only $O(m)$, (with a few additional ideas) we can use McDiarmid’s inequality to show that,
for any fixed $\beta$, $X \mid \beta$ is tightly concentrated around its mean $\mathbb{E}[X \mid \beta]$.

The second phase of the analysis will then bound $\mathbb{E}[X \mid \beta]$ as a random variable that depends on $\beta$’s
randomness. Although $\mathbb{E}[X \mid \beta]$ is a random variable (as a function of $\beta$), the fact that it is also an expectation (as
a function of $\alpha$) will allow us to use linearity of expectation in order to avoid any complications having to do
with dependencies across time. Leveraging this, we will show that $\mathbb{E}[X \mid \beta]$ is tightly concentrated around $\mathbb{E}[X]$.

Combining the two phases of the analysis, we will finally be able to conclude that $X$ is tightly
concentrated around $\mathbb{E}[X]$. We now perform the first phase of the analysis.


Claim 1. For any value of $\beta$, the random variable $X \mid \beta$ satisfies

$$X \mid \beta \le \mathbb{E}[X \mid \beta] + n/\operatorname{polylog} n$$

w.s.h.p.


Proof. How much is the value of $X$ affected when a single $\alpha_i$ changes? The answer is at most the number
of balls present at time $t$ that chose either the old or the new value of $\alpha_i$ (see footnote 10). Unfortunately,
there could be as many as $m$ such balls, meaning we cannot directly apply McDiarmid’s inequality.


9 We caution that these dependencies can be quite tricky to reason about. For example, in past work there are several examples
where authors attempted to use an unmanaged backyard [6][8], and either incorrectly assumed independence between the $X_i$'s [6], or
attempted to perform an erroneous stochastic-dominance argument as described above [8]. Fortunately, this issue is not a big deal in
either case, since in both cases it is straightforward to manage the backyard in question in order to fully recover the claimed results.
10 At first glance, the effect of changing a single $\alpha_i$, that is, changing the bin to which ball $a_i$ hashes, would seem to only change
$X$ by at most 1. But in fact, the effect can be much larger. Suppose that the new choice of $\alpha_i$ is a bin that has $h + \tau_h - 1$ balls
in it. Placing ball $a_i$ in that bin now fills the bin. Suppose now that there is a sequence of interleaved insertions of new balls
and deletions of unexposed balls from this bin. With ball $a_i$ in this bin, all the new balls are exposed, and had $a_i$ not been in the
bin, none of the new balls would have been exposed (because the alternating insertions and deletions would have kept the bin just
under capacity). So the worst-case effect on $X$ of changing $\alpha_i$ could be as large as $m$.


Define $Y_i = \sum_{j : \alpha_j = i} X_j$ to be the number of exposed balls in bin $i$ at time $t$, so $X = \sum_i Y_i$. Take $q$ to be a
sufficiently large constant and consider

$$X' = \sum_i \min(Y_i, \log^q m),$$

that is, a truncated version of $X$ where each bin can contribute at most $\operatorname{polylog} m$ balls. The
variable $X' \mid \beta$ is a function of $\alpha$ with Lipschitz bound $\ell = \log^q m$. Hence, by McDiarmid’s inequality,
$X' \mid \beta \le \mathbb{E}[X' \mid \beta] + n/\operatorname{polylog} n$, w.s.h.p.

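To complete the proof, we show that, w.s.h.p.,

$$X \mid \beta \le X' \mid \beta + n/\operatorname{polylog} n. \qquad (1)$$

In particular, this would mean that, w.s.h.p.,

$$\begin{aligned}
X \mid \beta &\le X' \mid \beta + n/\operatorname{polylog} n \\
&\le \mathbb{E}[X' \mid \beta] + n/\operatorname{polylog} n \\
&\le \mathbb{E}[X \mid \beta] + n/\operatorname{polylog} n.
\end{aligned}$$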


We now prove (1). For each bin $i$, define $W_i$ to be

$$W_i = \max\left(0, |\{j \mid \alpha_j = i\}| - \log^q m\right),$$

that is, if more than $\log^q m$ balls $\{a_j\}$ land in bin $i$, then $W_i$ counts the number of excess balls. By
design, $X - X' \le \sum_i W_i$ deterministically. Notice, however, that $\sum_i W_i$ is a function of the $m$
independent random variables $\alpha = \{\alpha_i\}$ with Lipschitz bound $\ell = 1$, and that $\mathbb{E}[\sum_i W_i] = o(1)$ (since
by a Chernoff bound each $W_i = 0$ w.h.p.). Thus we can apply McDiarmid’s inequality to deduce that
$\Pr[\sum_i W_i \mid \beta > n/\operatorname{polylog} n] \le 2^{-n/\operatorname{polylog} n}$, completing the proof. □


We next turn to the second phase of the analysis, which is to prove a concentration bound on the random
variable $\mathbb{E}[X \mid \beta]$ (whose outcome depends only on the randomness in $\beta$). Say that a bin is heavy (at a given
point in time) if it contains at least $h + \tau_h$ balls. Say that a time step is bad if there are more than $2n/h^c$
heavy bins at that step, and otherwise we say it is good. The reason for these names is that, if a ball is inserted
during a good step, then its probability of being exposed is at most $(2n/h^c)/n = 2/h^c$.

Let $B_i$ be the indicator random variable that is 1 if and only if $t_i$ is a bad step, and let $B = \sum_i B_i$.


Claim 2. For any choice of $\beta$, we deterministically have that

$$\mathbb{E}[X \mid \beta] \le 2n/h^{c-1} + \mathbb{E}[B \mid \beta].$$


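Proof. Let $F_i$ be the event that $a_i$ is labeled exposed at $t_i$. By linearity of expectation, we have

$$\mathbb{E}[X \mid \beta] = \sum_i \mathbb{E}[X_i \mid \beta] = \sum_i \Pr[F_i \mid \beta].$$

By considering whether each step is good or bad, we can decompose this as

$$\begin{aligned}
&\sum_i \left(\Pr[t_i \text{ good} \mid \beta] \cdot \Pr[F_i \mid \beta, t_i \text{ good}] + \Pr[t_i \text{ bad} \mid \beta] \cdot \Pr[F_i \mid \beta, t_i \text{ bad}]\right) \\
&\quad \le \sum_i \left(\Pr[F_i \mid \beta, t_i \text{ good}] + \Pr[t_i \text{ bad} \mid \beta]\right) \\
&\quad \le \sum_i \left(2/h^c + \Pr[t_i \text{ bad} \mid \beta]\right) \\
&\quad = 2n/h^{c-1} + \mathbb{E}[B \mid \beta].
\end{aligned}$$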


The following claim shows that w.s.h.p. there are no bad steps.


Claim 3. Any fixed time step is good w.s.h.p. (with probability taken over both $\alpha$ and $\beta$).


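Proof. Let $L_j$ be the load of bin $j$ at the fixed step. This is a binomial random variable with mean at most $h$.
Let $\varepsilon = (3c \log h)^{1/2}/h^{1/2}$. Then,

$$\begin{aligned}
\Pr[L_j > h + \tau_h] &= \Pr\left[L_j > h + (3c \log h)^{1/2} h^{1/2}\right] && \text{(by the choice of } \tau_h\text{)} \\
&= \Pr[L_j > (1 + \varepsilon) h] \\
&\le \Pr[L_j > (1 + \varepsilon)\, \mathbb{E}[L_j]] && \text{(as } \mathbb{E}[L_j] \le h\text{)} \\
&\le 1/h^c && \text{(by a Chernoff bound, and since } \mathbb{E}[L_j] \le h\text{)} \qquad (2)
\end{aligned}$$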


Let $Z_j$ be the indicator variable that is 1 if and only if $L_j > h + \tau_h$. Then, $Z = \sum_j Z_j$ is the number
of heavy bins. By linearity of expectation and Equation (2), $\mathbb{E}[Z] \le n/h^c$. Since $n/h^c \ge n/\operatorname{polylog} n$, and
since the $Z_j$'s are negatively associated, a Chernoff bound implies that $Z \le 2n/h^c$ w.s.h.p. □
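The arithmetic behind the Chernoff step can be spelled out as follows. This is a sketch under stated assumptions: the multiplicative Chernoff bound is used in the form $\Pr[L \ge (1+\varepsilon)\mu] \le \exp(-\varepsilon^2 \mu/3)$ for $0 < \varepsilon \le 1$ (which remains valid when $\mu$ is only an upper bound on the mean), logarithms are natural, and $\varepsilon = (3c \log h)^{1/2}/h^{1/2}$.

```latex
\[
\exp\!\left(-\frac{\varepsilon^{2} h}{3}\right)
= \exp\!\left(-\frac{3c \log h}{h} \cdot \frac{h}{3}\right)
= \exp(-c \log h)
= h^{-c},
\qquad\text{so}\qquad
\mathbb{E}[Z] = \sum_{j=1}^{n} \Pr[L_j > h + \tau_h] \le \frac{n}{h^{c}} .
\]
```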


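We now use Claim 3 to bound $\mathbb{E}[B \mid \beta]$, as follows.


Claim 4. $\mathbb{E}[B \mid \beta] = 0$ w.s.h.p. (with randomness taken over $\beta$).


Proof. We have that

$$\begin{aligned}
\mathbb{E}[\mathbb{E}[B \mid \beta]] &= \mathbb{E}[B] && \text{(by the tower rule)} \\
&= \sum_i \mathbb{E}[B_i] && \text{(by linearity)} \\
&\le \sum_i 1/2^{n/\operatorname{polylog} n} && \text{(by Claim 3)} \\
&= 1/2^{n/\operatorname{polylog} n}.
\end{aligned}$$

Thus, by Markov’s inequality, $\Pr[\mathbb{E}[B \mid \beta] \ge 1] \le 1/2^{n/\operatorname{polylog} n}$. □


Finally we put the two phases together to complete the proof.


Proof of Lemma 1. We have, w.s.h.p.,

$$\begin{aligned}
X &\le \mathbb{E}[X \mid \beta] + n/\operatorname{polylog} n && \text{(by Claim 1)} \\
&\le 2n/h^{c-1} + \mathbb{E}[B \mid \beta] + n/\operatorname{polylog} n && \text{(by Claim 2)} \\
&\le 2n/h^{c-1} + n/\operatorname{polylog} n && \text{(by Claim 4)} \\
&\le n/\operatorname{poly} h.
\end{aligned}$$

□


3 Basic Iceberg Hashing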


In this section, we consider the problem of constructing a space-efficient hash table with constant-time
operations, referential stability, and nearly optimal cache behavior. Our solution is the most basic version of an
Iceberg hash table; in subsequent sections, we will show how to modify the table to achieve stronger guarantees.

The lemmas in this section will assume access to constant-time fully random hash functions; we discuss 
the use of explicit families of hash functions at the end of the section. 


The structure of an Iceberg hash table. Let $N$ be an upper bound on the current number of keys $n$ in the
table, let $U$ be the universe of keys, and let $h$ be a parameter satisfying $h \le O(\log N/\log\log N)$; we call $h$ the
average-bin-fill parameter. Let $T$ be an arbitrary hash table implementation that supports constant-time
operations (w.h.p.) and load factor at least $1/\operatorname{poly} h$ (i.e., the table can store $n$ records in space $n \operatorname{poly} h$). We can
further assume without loss of generality that $T$ is stable, since any hash table can be made stable by adding
an extra level of indirection to the records (at the cost of a constant-factor loss in load factor and an extra
cache miss per operation). Iceberg hashing can be viewed as a technique for transforming $T$ into a new table
that is space-efficient, cache-efficient, and stable.

The Iceberg hash table consists of a front yard and a backyard. The front yard consists of $N/h$ bins, each of
which has capacity $h + \tau_h$ (recall from Section 2 that $\tau_h = k \cdot (h \log h)^{1/2}$ for some constant $k$). The backyard,
which will store only a small number of records (roughly $N/\operatorname{poly} h$), is implemented using $T$.

The front yard uses two hash functions: the function $\mathrm{bin} : U \to [N/h]$ maps keys to bins, and the function
$\mathrm{fp} : U \to [\operatorname{poly} h]$ maps keys to random $O(\log h)$-bit fingerprints.

When a new key $x$ is placed into the table, we first try to place it into its front-yard bin, bin(x). If bin(x)
contains fewer than $h + \tau_h$ records, and all of the records $y$ in the bin satisfy $\mathrm{fp}(x) \ne \mathrm{fp}(y)$, then $x$ is placed
into the bin (see footnote 11). Otherwise, $x$ is placed into the backyard $T$.

The insertion procedure ensures that, within each bin, the records all have distinct fingerprints. This
enables a simple space-efficient scheme for performing queries within the bin. Define a routing table to be a
dictionary that maps up to $h + \tau_h$ different fingerprints to indices $i \in [h + \tau_h]$ within a bin. As we shall discuss
shortly, as long as $h \le O(\log n/\log\log n)$, a routing table can be encoded in $O(1)$ machine words (and $O(h \log h)$
bits) with constant-time query/insert/delete operations (and with no cache misses). The routing table is used
within each bin to map the fingerprint fp(x) of each key x to the corresponding position of x in the bin.

Each bin b also maintains several other pieces of metadata: a fill counter keeping track of the number
of records in the bin, a vacancy bitmap keeping track of which slots are vacant in the bin, and a floating
counter keeping track of how many keys x in the backyard satisfy bin(x) = b.

The fill counter and vacancy bitmap are used to implement insertions in constant time. The floating
counter, on the other hand, is used to make queries more cache efficient. If a query for a key x goes to a bin
b whose floating counter is 0, then the query need not search for x in the backyard. As long as the bin fits in
a single cache line, the query to x then incurs only a single cache miss.
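The structure described so far can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `IcebergSketch`, the helper `_hash` (BLAKE2b standing in for fully random hash functions), the use of a Python dict per bin as the routing table, and a dict as the backyard table T are all choices made here for readability.

```python
import hashlib

def _hash(tag, key):
    """Deterministic stand-in for a fully random hash function (an assumption)."""
    data = repr((tag, key)).encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

class IcebergSketch:
    def __init__(self, num_bins, bin_capacity, fp_range):
        self.num_bins = num_bins
        self.bin_capacity = bin_capacity   # plays the role of h + tau_h
        self.fp_range = fp_range           # plays the role of poly(h)
        self.slots = [[None] * bin_capacity for _ in range(num_bins)]
        self.routing = [{} for _ in range(num_bins)]  # fingerprint -> slot index
        self.floating = [0] * num_bins                # floating counters
        self.backyard = {}                            # stands in for the table T

    def bin(self, key):
        return _hash("bin", key) % self.num_bins

    def fp(self, key):
        return _hash("fp", key) % self.fp_range

    def _front_slot(self, key):
        """Return (bin, slot index), with index None if key is not in the front yard."""
        b = self.bin(key)
        idx = self.routing[b].get(self.fp(key))
        if idx is not None and self.slots[b][idx][0] == key:
            return b, idx
        return b, None

    def query(self, key):
        b, idx = self._front_slot(key)
        if idx is not None:
            return self.slots[b][idx][1]
        if self.floating[b] == 0:          # no backyard key hashes to this bin
            return None
        return self.backyard.get(key)

    def insert(self, key, value):
        """Insert a key that is not currently in the table."""
        b, f = self.bin(key), self.fp(key)
        fill = len(self.routing[b])        # fill counter
        if fill < self.bin_capacity and f not in self.routing[b]:
            idx = self.slots[b].index(None)  # stand-in for the vacancy bitmap
            self.slots[b][idx] = (key, value)
            self.routing[b][f] = idx
        else:                              # capacity floater or fingerprint floater
            self.backyard[key] = value
            self.floating[b] += 1

    def delete(self, key):
        b, idx = self._front_slot(key)
        if idx is not None:
            self.slots[b][idx] = None
            del self.routing[b][self.fp(key)]
        elif key in self.backyard:
            del self.backyard[key]
            self.floating[b] -= 1
```

Note that a record never changes slots once placed, and a backyard record stays in the backyard even if its bin later has room; this mirrors the stability requirement, at the cost of relying on the analysis to show that the backyard stays small.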

In order to complete the description and analysis of Iceberg hash tables, we have two main tasks: to show 
that routing tables can be implemented in O(1) machine words with O(1)-time operations, which we do via 
standard bit techniques; and to show that the backyard remains small, even though records are never moved 
from the backyard to the front yard, which we do via the Iceberg lemma. 


Implementing a routing table with O(1) machine words. Let $a_1, a_2, \ldots, a_r$ be a set of distinct fingerprints
stored in a routing table, and let $b_1, b_2, \ldots, b_r$ be the corresponding indices of the fingerprints within the bin; the
routing table maps $a_i$ to $b_i$. Let $A$ be an array storing $a_1, a_2, \ldots, a_r$, and let $B$ be an array storing $b_1, b_2, \ldots, b_r$.
Note that $A$ and $B$ can be stored in $O(1)$ machine words using $O(h \log h)$ bits, since $r = O(h)$ and each element in
each array is $O(\log h)$ bits. Since $h = O(\log n/\log\log n)$, we have $|A| = |B| = O(h \log h) = O(\log n) = O(w)$,
where $w$ is the machine word size. The routing table simply stores $A$ and $B$, for a total of $O(1)$ words.

Queries to the routing table face the challenge of determining whether a fingerprint a is in the array A, and 
if so, then the query must also return b; for the index i such that a; = a. Fortunately, these operations can be 
implemented in constant time using standard bit techniques. 

In the following, we will exploit the fact that several useful word operations can be performed in constant 
time. In all our word operations, we will operate on small integers stored in a single word. 

When thinking of an integer $x$ as a bit string, we will treat it as being right justified, meaning that $x$’s
least significant bit is the final bit of the bit string. We denote the concatenation of two bit strings $x$ and $y$ by
$x \circ y = x \cdot 2^{|y|+1} + y$. That is, $x \circ y$ is the bits of $x$, followed by a padding bit 0, followed by the bits of $y$ (reading
from most to least significant bit). We say that $a_1, \ldots, a_k$ are packed into a word $A$ if $A = a_1 \circ \cdots \circ a_k$,
and we call the bit before $a_i$ the $i$th padding bit.

The proofs of Lemmas[2][9] and[[]use the following standard set of tools: 


1. Given a word, the position of the least significant 1-bit can be computed in O(1) time.


11 For convenience of notation, we will often treat a record as a key (rather than a key-value pair), allowing us to, for example,
talk about the fingerprint fp(x) for a record x.
2. Given a word, the position of the most significant 1-bit can be computed in O(1) time [20].

3. Given a bit string $a$ of at most $(w/k) - 1$ bits, the word $A = a \circ \cdots \circ a$ consisting of $k$ copies of $a$ can
be computed in O(1) time [20].

4. Given two sets of bit strings $\{x_1, \ldots, x_k\}$ and $\{y_1, \ldots, y_k\}$, where $|x_i| = |y_j| \le (w/k) - 1$ for all $i, j \in [k]$,
let $X = x_1 \circ \cdots \circ x_k$ and $Y = y_1 \circ \cdots \circ y_k$, and let $p_i$ be the location of the $i$th padding bit in both
$X$ and $Y$. Then in O(1) time [20], we can compute a word $Z$ where for $i \in [k]$, the $p_i$th bit of $Z$ is 1 if
$x_i \ge y_i$ and 0 otherwise (and the bits not corresponding to padding-bit locations $p_i$ are 0). That is, we
can compare every $x_i$ and $y_i$ in O(1) time.


Lemma 2. Let $x_1, x_2, \ldots, x_r$ be bit strings of length $s \le (w/r) - 1$, let $X = x_1 \circ \cdots \circ x_r$, and let $y$ be an $s$-bit
number. In constant time, one can determine whether $y \in \{x_1, \ldots, x_r\}$, and for what index $i$ we have $x_i = y$
(if such an $i$ exists).


Proof. We use the standard techniques outlined above. We pack $r$ copies of $y$ in a new word $Y$. Compare $X$
with $Y$, compare $Y$ with $X$, and AND together the resulting comparison-indicator words. The result yields an
equality-indicator word $Z$ (that is, the $i$th padding bit of $Z$ indicates whether $x_i = y$). We find which $x_i = y$,
if any exists, by finding the least significant 1-bit of $Z$. □
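The proof of Lemma 2 can be traced with ordinary integer arithmetic. In the sketch below a Python integer stands in for a machine word, field 0 is stored in the low-order bits (an ordering chosen here for convenience), and the parallel comparison uses the classic trick of setting every padding bit before subtracting. The function names are hypothetical, and a word-RAM implementation would precompute the masks as constants.

```python
def pack(fields, s):
    """Pack s-bit fields into one integer, leaving a 0 padding bit above each field."""
    word = 0
    for i, value in enumerate(fields):
        assert 0 <= value < (1 << s)
        word |= value << (i * (s + 1))
    return word

def padding_mask(r, s):
    """Word with a 1 in each of the r padding-bit positions."""
    return sum(1 << (i * (s + 1) + s) for i in range(r))

def replicate(y, r, s):
    """r copies of the s-bit value y, produced by a single multiplication."""
    return y * sum(1 << (i * (s + 1)) for i in range(r))

def compare_ge(X, Y, r, s):
    """Padding bit i of the result is 1 exactly when field i of X >= field i of Y."""
    P = padding_mask(r, s)
    # Setting the padding bits guarantees that no subtraction borrows across fields.
    return ((X | P) - Y) & P

def find_fingerprint(X, y, r, s):
    """Index i with field i of X equal to y, or None if there is no such field."""
    Y = replicate(y, r, s)
    equal = compare_ge(X, Y, r, s) & compare_ge(Y, X, r, s)
    if equal == 0:
        return None
    lowest = (equal & -equal).bit_length() - 1   # position of the least significant 1-bit
    return (lowest - s) // (s + 1)

X = pack([5, 9, 3, 12], s=4)
index_of_3 = find_fingerprint(X, 3, r=4, s=4)
```

Because the fingerprints within a bin are distinct, at most one padding bit of the equality word is set, so taking the least significant 1-bit identifies the unique match.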


The simplicity of the routing table’s encoding makes insertions and deletions of fingerprints easy to 
implement in constant time. In particular, insertions and deletions simply need to update a single entry in 
each of the arrays A and B. 


Analysis of Iceberg hashing. The challenge in analyzing the Iceberg hash table is to bound the number of
records in the backyard. There are two types of keys x in the backyard: keys x that were placed in the backyard
due to lack of space in bin(x), and keys x that were placed in the backyard due to a fingerprint collision with
another key y in bin(x). We refer to keys of the former type as capacity floaters and keys of the latter type
as fingerprint floaters. As we shall see, the number of capacity floaters can be bounded by the Iceberg
Lemma, and the number of fingerprint floaters can be bounded by an analysis using McDiarmid’s inequality.

Although for now we are only interested in $h \le O(\log N/\log\log N)$, later in the paper we will also consider
even more space-efficient variants of Iceberg hashing in which $h$ is larger. To simplify discussion later, we state
several of the lemmas in this section for arbitrary $h \le \operatorname{polylog} N$.

We begin by bounding the number of fingerprint floaters. We remark that the proof of the next lemma 
requires a bit of care to avoid any potential subtle circular dependencies between the random variables being 
analyzed. 


Lemma 3. Suppose $h \le \operatorname{polylog} N$. Then w.s.h.p. in $N$, there are at most $N/\operatorname{poly} h$ fingerprint floaters.
Moreover, for a given key x, the probability that there is a fingerprint floater y such that bin(x) = bin(y) is at
most $1/\operatorname{poly} h$.


Proof. Let $t$ denote the current time, and let $X$ denote the set of keys present at time $t$. For each key $x \in X$,
let $A_x$ denote the event that (bin(x), fp(x)) = (bin(y), fp(y)) for some $y \in X \setminus \{x\}$, and let $B_x$ denote the event
that (bin(x), fp(x)) = (bin(y), fp(y)) for some $y \notin X$ such that $y$ was present in the table when $x$ was inserted.
The total number of fingerprint floaters is upper bounded by

$$\sum_{x \in X} A_x + \sum_{x \in X} B_x,$$

where $A_x$ and $B_x$ are treated as indicator random variables.

The function $A = \sum_{x \in X} A_x$ is determined by the $|X| \le N$ independent random variables
$\{(\mathrm{bin}(x), \mathrm{fp}(x)) \mid x \in X\}$, and $A$ has Lipschitz bound $\ell = 1$. By McDiarmid’s inequality, it follows
that $A \le \mathbb{E}[A] + N/\operatorname{polylog} N$, w.s.h.p. in $N$. On the other hand, since every $x, x' \in X$ collide in their
bin-choice/fingerprint with probability $\frac{h}{N} \cdot \frac{1}{\operatorname{poly} h}$, each event $A_x$ occurs with probability at most $1/\operatorname{poly} h$,
which means that $\mathbb{E}[A] \le N/\operatorname{poly} h$. Thus, w.s.h.p. in $N$, we have $A \le N/\operatorname{poly} h + N/\operatorname{polylog} N \le N/\operatorname{poly} h$.
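To make the union bound concrete, suppose (as an illustrative assumption) that fingerprints are uniform over $h^c$ values for some constant $c > 1$, so that two fixed keys agree in both bin and fingerprint with probability $(h/N) \cdot h^{-c}$. A sketch of the calculation:

```latex
\[
\Pr[A_x] \;\le\; \sum_{y \in X \setminus \{x\}}
  \Pr\big[(\mathrm{bin}(x), \mathrm{fp}(x)) = (\mathrm{bin}(y), \mathrm{fp}(y))\big]
\;\le\; N \cdot \frac{h}{N} \cdot \frac{1}{h^{c}} \;=\; \frac{1}{h^{c-1}},
\qquad
\mathbb{E}[A] \;=\; \sum_{x \in X} \Pr[A_x] \;\le\; \frac{N}{h^{c-1}} .
\]
```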

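Next we analyze $B = \sum_{x \in X} B_x$. Let $Z_y$ be the outcome of (bin(y), fp(y)) and let $Z = \{Z_y \mid y \notin X\}$. If
we condition on any fixed $Z$, then the random variables $\{B_x \mid x \in X\}$ become independent. It follows by a
Chernoff bound that $B \mid Z \le \mathbb{E}[B \mid Z] + N/\operatorname{polylog} N$, w.s.h.p. in $N$. Define $T_x$ to be the set of elements
present when a given $x$ is inserted. No matter what the outcome of $Z$ is, we have by a union bound that

$$\begin{aligned}
\mathbb{E}[B \mid Z] &\le \sum_{x \in X} \sum_{\substack{y \notin X \\ y \in T_x}} \Pr[(\mathrm{bin}(x), \mathrm{fp}(x)) = (\mathrm{bin}(y), \mathrm{fp}(y)) \mid Z] \\
&= \sum_{x \in X} \sum_{\substack{y \notin X \\ y \in T_x}} \frac{h}{N} \cdot \frac{1}{\operatorname{poly} h} \le N/\operatorname{poly} h.
\end{aligned}$$

Thus we have $B \le N/\operatorname{poly} h + N/\operatorname{polylog} N = N/\operatorname{poly} h$, w.s.h.p. in $N$.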

So far we have shown that, w.s.h.p. in $N$, there are at most $N/\operatorname{poly} h$ fingerprint floaters at time $t$. It
remains to show that, for a given key x, the probability that there is a fingerprint floater y at time t such that
bin(x) = bin(y) is at most $1/\operatorname{poly} h$.

The probability that x itself is a fingerprint floater is at most $1/\operatorname{poly} h$, and similarly the probability that
there is any y at time t such that (bin(y), fp(y)) = (bin(x), fp(x)) is at most $1/\operatorname{poly} h$. To complete the proof,
we must bound the probability that there exists a fingerprint floater y at time t such that bin(x) = bin(y) and
such that, when y was inserted, there was a record $z \ne x$ present such that (bin(y), fp(y)) = (bin(z), fp(z)).
We know that, w.s.h.p. in $N$, there are at most $N/\operatorname{poly} h$ keys y such that when y was inserted, there was a
record $z \ne x$ present such that (bin(y), fp(y)) = (bin(z), fp(z)). Record x has probability $h/N$ of satisfying
bin(x) = bin(y) for each of these y’s. By a union bound, the probability of x satisfying bin(x) = bin(y) for any
such y is at most $O\left(\frac{N}{\operatorname{poly} h} \cdot \frac{h}{N}\right) = 1/\operatorname{poly} h$, which completes the proof. □

Say a record x is a capacity exposer if, when x was inserted, there were already at least $h + \tau_h$ records y
present (including the records in the backyard) such that bin(x) = bin(y). Rather than analyzing the number
of capacity floaters directly, we instead analyze the number of capacity exposers (although this distinction is
not important now, it will be later in our analysis in Section 4).


Lemma 4. Suppose $h \le \operatorname{polylog} N$. Then, w.s.h.p. in $N$, there are at most $N/\operatorname{poly} h$ capacity exposers in the
table. Moreover, for a given key x, the probability that there is a capacity exposer y such that bin(y) = bin(x)
is at most $1/\operatorname{poly} h$.


Proof. By the Iceberg Lemma, the number of capacity exposers at any given moment is at most $N/\operatorname{poly} h$
w.s.h.p. in $N$.

It remains to show that, for a given key x, the probability of bin(x) having a capacity exposer is at most
$1/\operatorname{poly} h$. Let $A$ denote the set of keys y present at time t such that, when y was inserted there were at least
$h + \tau_h - 1$ other keys z in bin(y) satisfying $z \ne x$. By the Iceberg Lemma (applied using $\tau'_h = \tau_h - 1$), $|A|$
is at most $N/\operatorname{poly} h$ w.s.h.p. in $N$. The probability that bin(x) contains any elements from $A$ is therefore
$1/\operatorname{poly} h$. On the other hand, in order for bin(x) to have a capacity floater $y \ne x$, we must have $y \in A$. Thus
the probability of bin(x) having a capacity floater is at most $1/\operatorname{poly} h$, completing the proof. □


Combining the preceding lemmas, we analyze the backyard. 


Lemma 5. Suppose $h \le \operatorname{polylog} N$. Then, w.s.h.p. in $N$, there are at most $N/\operatorname{poly} h$ records in the backyard.
Moreover, for a given key x, the probability that bin(x) has a non-zero floating counter is at most $1/\operatorname{poly} h$.


Proof. Let t be the current time. By Lemma 3, w.s.h.p. in $N$, there are at most $N/\operatorname{poly} h$ fingerprint floaters
at time t. Also by Lemma 3, the probability that a given record x hashes to a bin(x) for which there is at least
one fingerprint floater is at most $1/\operatorname{poly} h$.

Since every capacity floater is a capacity exposer, we can use Lemma 4 to deduce that, w.s.h.p., there are
at most $N/\operatorname{poly} h$ capacity floaters at time t. Also by Lemma 4, the probability that a given record x hashes
to a bin(x) for which there is at least one capacity floater is at most $1/\operatorname{poly} h$. This completes the proof. □


The previous lemmas all assume access to fully random hash functions. In Appendix A, we describe how to
modify Iceberg hashing (both the simple version described in this section and the stronger variants in subsequent
sections) to be compatible with an explicit family of hash functions (the transformation is essentially the same
as the one used in past works [4][34]). The transformation preserves all of the properties of Iceberg hashing that
we care about (time efficiency, cache efficiency, space efficiency, and stability) but, as in previous work, this
introduces an additional $1/\operatorname{poly} N$ failure probability due to the hash functions themselves. Thus, so that
our analysis of Iceberg hashing is compatible with an explicit family of hash functions, we state Theorem 2 (as
well as the other main theorems of the paper) in terms of w.h.p. guarantees rather than in terms of w.s.h.p.
guarantees. The only exception to this will be in Section 6, where we prove w.s.h.p. guarantees assuming fully random
hash functions, building on past work [21][22] which has also assumed full randomness for the same reasons.
We now present the full analysis of the Iceberg hash table.


Theorem 2. Consider an Iceberg hash table that never contains more than $N$ elements and suppose that
the average-bin-fill parameter satisfies $h = O(\log N/\log\log N)$. Suppose that the backyard table supports
constant-time operations (w.h.p. in $n$), supports load factor at least $1/\operatorname{poly} h$, and is stable.

Consider a sequence of operations in which the number of records in the table never exceeds $N$, and consider
a query, insert, or delete that is performed on some key x. Then, the following guarantees hold.


• Time Efficiency. The operation on x takes constant time in the RAM model, w.h.p. in $N$.


• Cache Efficiency. Consider the EM model using a cache line of size $B = O(h)$ and a cache of size
$M = \Omega(B)$, and suppose that each bin in the front yard is memory aligned, that is, each bin is stored in
a single cache line. Finally, suppose that the description bits of the hash functions are cached. Then the
operation on x has probability at least $1 - 1/\operatorname{poly} B$ of incurring only a single cache miss.


• Space Efficiency. The total space in machine words consumed by the table is, w.h.p. in $N$,

$$\left(1 + O\left(\frac{\tau_h}{h}\right)\right) N = (1 + o(1)) N.$$

• Stability. The hash table is stable.


Proof. The claims of time efficiency and stability follow directly from the construction of the Iceberg hash
table. By Lemma 5, the space consumed by the backyard is $N/\operatorname{poly} h$ machine words w.h.p. in $N$. The space
in machine words consumed by the front yard is deterministically

$$\frac{N}{h} \times (h + \tau_h + O(1)) \le \left(1 + O\left(\frac{\tau_h}{h}\right)\right) N,$$

which completes the proof of space efficiency.

Finally, we prove the claim of cache efficiency. If the operation on key x is a query, then the probability
of incurring more than one cache miss is equal to the probability that bin(x) has a non-zero floating counter.
By Lemma 5, this probability is at most $1/\operatorname{poly} h = 1/\operatorname{poly} B$. By the same analysis, the probability that a
deletion incurs multiple cache misses is also $1/\operatorname{poly} B$.

If the operation is an insertion, then the probability of incurring more than one cache miss is equal to the
probability that x is placed into the backyard. This, in turn, is the probability that either (1) there is another
record y in bin(x) such that fp(y) = fp(x); or (2) there are already $h + \tau_h$ records in bin(x). The probability
of (1) is at most $N \cdot \frac{h}{N} \cdot 1/\operatorname{poly} h = 1/\operatorname{poly} h$ by a union bound, and the probability of (2) is also $1/\operatorname{poly} h$ by
a Chernoff bound.

Thus, for any operation, the probability of incurring more than one cache miss in the EM model is
$1/\operatorname{poly} h = 1/\operatorname{poly} B$. □


4 Dynamic Resizing with Waterfall Addressing 


In this section, we show how to transform Iceberg hashing into a dynamically resizable hash table, while 
preserving the space efficiency, time efficiency, and cache efficiency of the original data structure (and also 
preserving stability during time windows in which the table is not resized). 

The core challenges that one encounters when trying to make Iceberg hashing space- and time-efficiently
dynamically resizable are the same as those that arise for any other direct-mapped hash table. To capture
this set of challenges formally, this section defines the dynamic bin addressing problem. We then give an
efficient solution to this problem, which we call waterfall addressing, and we show how to use waterfall
addressing to construct a dynamically resizable version of Iceberg hashing.


Resizing through partial expansions. If one does not care about space efficiency, then the classic approach 
to dynamically resizing a hash table is to simply rebuild it whenever its size changes by a constant factor. On 
the other hand, if space efficiency is a concern, then the hash table must be resized in smaller increments. 

Suppose that we wish to maintain a hash table at a load factor of $1 - O(1/s)$ for some power-of-two
parameter $s = \omega(1)$. Perhaps the most natural approach is to grow the table through small partial
expansions. Each time that the hash table doubles in size, a total of $s$ partial expansions are performed. We
call $s$ the resize granularity. If the hash table initially consists of $2^a$ bins, then each of the partial expansions
increases the number of bins by $2^a/s$, so that after $s$ partial expansions the number of bins becomes $2^{a+1}$. The
partial expansions are spread out over time so that the average load on each bin never changes by a factor of
more than $1 + O(1/s) = 1 + o(1)$.
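The schedule of partial expansions is easy to tabulate. The short sketch below lists the bin counts during one doubling and the growth factor of each step; the function name and the sample values a = 10, s = 8 are illustrative assumptions.

```python
def partial_expansion_sizes(a, s):
    """Bin counts before and after each of the s partial expansions of one doubling."""
    assert s & (s - 1) == 0 and (1 << a) % s == 0, "s must be a power of two dividing 2^a"
    step = (1 << a) // s                     # each partial expansion adds 2^a / s bins
    return [(1 << a) + j * step for j in range(s + 1)]

sizes = partial_expansion_sizes(a=10, s=8)   # from 1024 bins up to 2048 bins
growth = [after / before for before, after in zip(sizes, sizes[1:])]
```

The first step has the largest relative growth, exactly 1 + 1/s, and later steps grow the table by a smaller factor because the same number of bins is added to a larger table.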


The problem: dynamically mapping elements to bins. How should we map elements to bins after each
partial expansion? If a hash table consists of $m$ bins, and we have a fully random hash function $g : U \to [2^w]$
(for some $w$ satisfying $2^w \gg m$), then the classic approach to mapping elements $x \in U$ to bins $[m]$ is to simply
use the bin assignment function

$$\mathrm{Bin}_m(x) = g(x) \pmod{m}. \qquad (3)$$


The problem with this bin assignment function is that, whenever a partial expansion is performed, almost
all of the records in the hash table will have their bin assignments changed. This means that, if a partial
expansion is performed on a hash table with $n$ elements, then the expansion will require $\Omega(n)$ time. In contrast,
if we wish to have O(1)-time operations, then each partial expansion must take time at most $O(n/s)$.
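A quick experiment illustrates how badly the modular assignment behaves under a partial expansion. The sketch below draws random 64-bit values as stand-ins for g(x) (an assumption; any well-mixed hash would do) and measures how many of them change bins when the table grows from m to m + m/s bins.

```python
import random

def fraction_moved(m_old, m_new, samples=100_000, seed=0):
    """Fraction of random 64-bit hash values whose bin changes under g(x) mod m."""
    rng = random.Random(seed)
    moved = 0
    for _ in range(samples):
        g = rng.getrandbits(64)
        if g % m_old != g % m_new:
            moved += 1
    return moved / samples

s = 16
m_old = 1024
m_new = m_old + m_old // s            # one partial expansion adds 2^a / s bins
moved = fraction_moved(m_old, m_new)  # close to 16/17 for these parameters
ideal = (m_new - m_old) / m_new       # share of bins that are newly added
```

For these parameters well over 90% of the values move, whereas an assignment with the clean promotion property would move only about the share of newly added bins, roughly 6%.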


The dynamic bin-addressing problem. Let $U$ be the universe and $g : U \to [2^w]$ a fully random hash
function, where $w$ is the number of bits in a machine word and $2^w$ is an upper bound on the number of bins
that will ever be in our hash table.

Define $m_{a,j} = 2^a + j \cdot 2^a/s$ to be the number of bins after the $j$-th partial expansion in the process of doubling
a table from $2^a$ to $2^{a+1}$ bins. Define a bin assignment function $\mathrm{Bin}(a, j, x) : [\log s, w - 1] \times [s] \times U \to [m_{a,j}]$
to be the function that assigns keys $x$ to bins after the $j$-th partial expansion in the process of doubling a table
from $2^a$ to $2^{a+1}$ bins. As an abuse of notation, we also define $\mathrm{Bin}(a, 0, x) = \mathrm{Bin}(a - 1, s, x)$.

A bin assignment function is a solution to the dynamic bin-addressing problem if it satisfies the 
following three properties. 


• The Clean Promotion Property. If $\mathrm{Bin}(a, j, x) \ne \mathrm{Bin}(a, j + 1, x)$, then

$$\mathrm{Bin}(a, j + 1, x) \in (m_{a,j}, m_{a,j+1}].$$

In other words, whenever a partial expansion is performed, the only keys that move are the keys that
are assigned to the newly added bins.


• Independence. For any given $a, j$, the function $\mathrm{Bin}(a, j, x)$ is mutually independent across all $x \in U$.


• Near Uniformity. For every $a, j, x$ and for every $\ell \in [m_{a,j}]$,

$$\Pr[\mathrm{Bin}(a, j, x) = \ell] = (1 + O(1/s)) \cdot \frac{1}{m_{a,j}}.$$


Whereas independence and near uniformity are necessary for any addressing scheme (even in a fixed-size 
hash table), the clean promotion property is what glues together the outcomes of Bin(a,j,x) for different 
values of a and j. It ensures that only roughly a 1/s-fraction of elements will have their address changed by 
any given partial expansion. 

A consequence of the clean promotion property is that the functions $\mathrm{Bin}(a, j, x)$ and $\mathrm{Bin}(a, j + 1, x)$ must
be closely related to one another. Thus a natural approach is to define the function $\mathrm{Bin}(a, j, x)$ recursively, so
that $\mathrm{Bin}(a, j, x)$ depends on $\mathrm{Bin}(a', j', x)$ for $a' \le a$ and $j' \le j$. In 1980, Larson gave an elegant construction
[31] showing that such a recursive approach is indeed possible; Larson’s scheme can be used to construct a
solution $\mathrm{Bin}(a, j, x)$ to the dynamic bin addressing problem that can be evaluated in logarithmic expected
time. Larson’s scheme has found many applications to external-memory problems, but the logarithmic evaluation
time has prevented it from being useful for internal memory hash tables.

This section shows that, somewhat remarkably, it is possible to achieve the clean promotion property without
recursion, and it is even possible to construct a solution to the dynamic bin addressing problem that can be
evaluated in O(1) worst-case time. We call our solution, which we present in Subsection 4.1, waterfall addressing.


Efficiently finding which records need to be moved. So far we have focused on how to map records to a
dynamically changing set of bins, but this alone does not fully solve the problem of how to dynamically resize
a hash table. The clean promotion property ensures that each partial expansion moves only $O(n/s)$ records.
But how do we efficiently locate those records without performing a full scan through the table? (See footnote 12.)

In Subsection 4.2, we show that it is possible to incorporate waterfall addressing into Iceberg hashing in
a way that solves this problem. In particular, by adding $O(\log s)$ bits of overhead to each element in the hash
table, we make it possible to locate in time $O(n/s)$ which records need to be moved. Our solution is not specific
to Iceberg hashing; it can just as well be used with any hash table that stores the majority of its elements in
an array organized into bins.


Incorporating waterfall addressing into Iceberg hashing. Finally, when applying waterfall addressing
to Iceberg hashing, there are several additional technical challenges that arise. These challenges are specific
to Iceberg hashing, and in particular, to how the probabilistic guarantees on the size of the backyard of the
hash table interact with the dynamic resizing. We show how to solve these issues in Subsection 4.3. In doing
so, we obtain a version of Iceberg hashing that is fully dynamic.


4.1 Waterfall Addressing 


In this subsection, we describe a constant-time solution to the dynamic bin-addressing problem. 

To simplify discussion, we will think of the bins as being broken into chunks of $E = 2^a/s$ bins during the
$2^a$ doubling (i.e., from $2^a$ bins to $2^{a+1}$ bins). At the beginning of this doubling there are s chunks and at the
end there are 2s chunks. Furthermore, when we refer to the size of a chunk (or of the table as a whole), we 
shall be referring to the number of bins. 


4.1.1 A starting place: Larson’s recursive scheme. 


The clean promotion property was introduced in 1980 by Larson [31], who gave an elegant technique for 
achieving the property and applied the approach to cache-efficient linear hashing. 

In Larson’s scheme, each key x has an (infinite) sequence of chunk hash functions that are used during
the $2^a$ doubling (i.e., from $2^a$ bins to $2^{a+1}$ bins):

$$g_1^{(a)}(x),\ g_2^{(a)}(x),\ \ldots,$$

where each $g_i^{(a)} : U \to [2s]$ maps elements uniformly to chunks. For each $r \in [s+1, 2s]$, define $G^{(a)}(x, r)$ to be
$g_i^{(a)}(x)$ for the smallest i such that $g_i^{(a)}(x) \le r$. The definition of $G^{(a)}(x, r)$ satisfies two elegant properties: that
(a) $G^{(a)}(x, r)$ is uniformly random in $[r]$; and that (b) either $G^{(a)}(x, r+1) = G^{(a)}(x, r)$ or $G^{(a)}(x, r+1) = r+1$.

Larson’s scheme computes Bin(a, j, x) as follows. First, recursively compute p = Bin(a, 0, x) to be the
position that x would reside in if the table had $2^a$ bins. Then set

$$\mathrm{Bin}(a, j, x) = \begin{cases} p & \text{if } G^{(a)}(x, s+j) \le s, \\ G^{(a)}(x, s+j) \cdot E + (p \bmod E) & \text{otherwise.} \end{cases}$$

That is, if $G^{(a)}(x, s+j)$ returns a non-expansion chunk, then we do not move the item; otherwise, we use the
chunk hash function $G^{(a)}(x, s+j)$ to determine the high-order bits of the new address and the old address p to
determine the low-order bits.

The pseudocode for recursively computing the bin address for a record x is given by Algorithm 1.
12 Note that this problem arises only for partial expansions and not for the reverse operation, a partial
contraction. In particular, when performing a partial contraction, the elements that need to be moved are precisely the ones that
reside in the part of the table being eliminated.

Algorithm 1 Larson’s Address Computation: Computing Bin(a, j, x)

Description: Suppose we are doubling from size $2^a$ to $2^{a+1}$ in chunks of size $E = 2^a/s$ bins. This function
computes record x’s bin number after the j-th partial expansion.

1: if $2^{a+1} = s$ and $j = s$ then return $g_1^{(a)}(x)$ end if    ▷ Base case
2: $p \leftarrow \mathrm{Bin}(a-1, s, x)$    ▷ p is the bin assignment for x when the table was size exactly $2^a$
3: $i \leftarrow 1$
4: while $g_i^{(a)}(x) > s + j$ do
5:     $i \leftarrow i + 1$
6: end while
7: if $g_i^{(a)}(x) \le s$ then return p
8: else return $g_i^{(a)}(x) \cdot E + (p \bmod E)$
9: end if


Lemma 6 ([31]). Larson’s address computation function Bin(a, j, x) satisfies the clean promotion property
and maps each record x uniformly at random to a bin in $[2^a + jE]$.
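To make Algorithm 1 concrete, the following is a minimal Python sketch rather than a definitive implementation. It rests on several assumptions that are not part of the text above: chunks and bins are 0-indexed (so chunk hashes lie in range(2s) and a chunk is valid when its index is below s + j), s is a power of two, the smallest table has exactly s bins and is addressed by a hypothetical base hash, and the random hash functions are simulated with BLAKE2b. The names h, chunk_hash, and larson_bin are illustrative only.

```python
import hashlib


def h(*parts):
    """Deterministic 64-bit hash of its arguments (stand-in for a random hash)."""
    data = repr(parts).encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def chunk_hash(a, i, x, s):
    """g_i^{(a)}(x): i-th chunk hash for the 2^a doubling, 0-indexed in range(2s)."""
    return h("g", a, i, x) % (2 * s)


def larson_bin(a, j, x, s):
    """Bin of x when the table has 2^a + j*E bins, with E = 2^a / s."""
    size = 1 << a
    E = size // s
    # p is the address of x in the table of exactly 2^a bins.
    # Larson's scheme always computes it, hence the recursion at every level.
    if size == s:
        p = h("base", x) % s          # assumed base case: smallest table
    else:
        p = larson_bin(a - 1, s, x, s)
    i = 1
    while chunk_hash(a, i, x, s) >= s + j:   # skip chunks that do not exist yet
        i += 1
    g = chunk_hash(a, i, x, s)
    if g < s:
        return p                      # non-expansion chunk: x does not move
    return g * E + (p % E)            # expansion chunk: keep old low-order bits
```

Because larson_bin recurses through every earlier doubling to obtain p, each call performs work proportional to a, which is the cost that waterfall addressing is designed to remove.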


The recursive structure of Larson’s scheme causes it to take time $\Omega(a)$ (and expected time $O(a)$). The
main contribution of this section is waterfall addressing, a technique that improves this running time to
O(1) worst case. We begin by showing how to reduce the expected time to O(1).


4.1.2 Waterfall addresses in constant expected time. 


We modify Larson’s scheme by introducing a master hash function $m(x) : U \to [2^w]$. Whenever a key x
is moved into a new chunk by a partial expansion, we use m(x) to determine the low-order bits of x’s address,
rather than the recursively computed p. That is, we simply set the offset to be m(x) (mod E). Algorithm 2
gives the pseudocode for this addressing scheme (compared to Algorithm 1, line 2 is removed and lines 7
and 8 are modified).


Algorithm 2 Waterfall Address Computation: Computing Bin(a, j, x)

Description: Suppose we are doubling from size $2^a$ to $2^{a+1}$ in chunks of size $E = 2^a/s$ bins. Let m(x) be the
master hash function. This function computes record x’s bin number after the j-th partial expansion. The
smallest allowable table size is s.

1: if $2^{a+1} = s$ and $j = s$ then return $g_1^{(a)}(x)$ end if    ▷ Base case
2: (removed: $p \leftarrow \mathrm{Bin}(a-1, s, x)$)    ▷ We no longer need to recursively compute p.
3: $i \leftarrow 1$
4: while $g_i^{(a)}(x) > s + j$ do
5:     $i \leftarrow i + 1$
6: end while
7: if $g_i^{(a)}(x) \le s$ then return Bin(a - 1, s, x)    ▷ This is the only case where we recurse.
8: else return $g_i^{(a)}(x) \cdot E + (m(x) \bmod E)$    ▷ Use the master hash instead of recursing.
9: end if


We call the addressing scheme waterfall addressing because, as the table grows, more and more of the 
bits of x’s address are determined by m(x). Different records x converge towards matching m(x) at different
rates, together forming a sort of “waterfall”. In a single table there will simultaneously be records x that agree 
with m(x) in almost all of their bits, and (far fewer) records that agree with m(x) in only a few bits (these 
records are at the “top” of the waterfall). 
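The change from Larson's scheme to waterfall addressing can be sketched in Python as follows. This is an illustration under assumptions of our own, not the authors' implementation: chunks and bins are 0-indexed, s is a power of two, the smallest table has s bins addressed by a hypothetical base hash, and the helper names are invented for the example. The only difference from a direct transcription of Algorithm 1 is that the offset inside an expansion chunk comes from the master hash, so recursion happens only when x lands in a non-expansion chunk.

```python
import hashlib


def h(*parts):
    """Deterministic 64-bit hash of its arguments (stand-in for a random hash)."""
    data = repr(parts).encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def chunk_hash(a, i, x, s):
    """g_i^{(a)}(x), 0-indexed in range(2s)."""
    return h("g", a, i, x) % (2 * s)


def master(x):
    """m(x): the master hash, here 64 bits wide."""
    return h("m", x)


def waterfall_bin(a, j, x, s):
    """Bin of x when the table has 2^a + j*E bins, with E = 2^a / s."""
    size = 1 << a
    E = size // s
    i = 1
    while chunk_hash(a, i, x, s) >= s + j:   # find the first existing chunk
        i += 1
    g = chunk_hash(a, i, x, s)
    if g >= s:
        # Expansion chunk: low-order bits come from m(x), no recursion needed.
        return g * E + master(x) % E
    if size == s:
        return h("base", x) % s              # assumed base case: smallest table
    # Non-expansion chunk: the only case where we recurse.
    return waterfall_bin(a - 1, s, x, s)
```

Each level recurses with probability at most 1/2 in this sketch, which mirrors the constant expected evaluation time discussed next.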

When analyzing waterfall addressing, it will be useful to note that every recursively called subproblem of
Algorithm 2 has a power-of-2 number of bins (that is, j = s). In this case, the pseudocode for Algorithm 2
simplifies considerably. Since we always use $g_1^{(a)}(x)$, rather than having to find some $g_i^{(a)}(x)$, we are able to
skip lines 3-6 of Algorithm 2, resulting in Algorithm 3.

Algorithm 3 Computing Bin(a, j, x) when j = s

Description: We compute record x’s bin number after the s-th partial expansion.

if $2^{a+1} = s$ then return $g_1^{(a)}(x)$ end if    ▷ Base case
if $g_1^{(a)}(x) \le s$ then return Bin(a - 1, s, x)
else return $g_1^{(a)}(x) \cdot E + (m(x) \bmod E)$
end if


We now give an analysis of (this basic version of) waterfall addressing. 


Lemma 7. Waterfall addresses can be evaluated in O(1) expected time (Algorithm 2). Moreover, for a given
record x, Bin(a, j, x) is uniformly distributed across $[2^a + jE]$.


Proof. We begin by analyzing the running time. Lines 3-6 of Algorithm 2 take constant expected time since
each iteration of the while loop has at least a 1/2 probability of terminating. In subsequent levels of recursion,
the algorithm reduces to Algorithm 3, which takes constant time per layer of recursion. Algorithm 3 has
exactly a 1/2 probability of terminating in each level of recursion (because $\Pr[g_1^{(a)}(x) \le s] = 1/2$). Thus the
expected time to evaluate Algorithm 3 is also constant.

Next, we argue that bin assignments are performed uniformly at random. Suppose by induction that this
is true for all $a' < a$. The value of $g_i^{(a)}(x)$ used in lines 7-8 of Algorithm 2 is uniformly random in $[s + j]$.
Thus, with probability $s/(s+j)$, x is assigned to the first s chunks using the recursively computed value of
Bin(a - 1, s, x), which we know by induction to be uniform in $[2^a]$. On the other hand, with probability $j/(s+j)$,
x is assigned to a random one of the final j chunks and is then given a random offset into the chunk using the
master hash. This process assigns x uniformly at random in $[2^a + 1, 2^a + jE]$. Combining the two cases, x is
assigned to a random position in $[2^a + jE]$. □


4.1.3 Waterfall addresses in worst-case constant time. 


For Iceberg hashing, we want worst-case constant-time operations. To this end, we now define truncated 
waterfall addressing. 


One of the main issues with waterfall addressing (and Larson’s scheme before it) is that we may need to
evaluate an arbitrarily long sequence of chunk hash functions $g_i^{(a)}$. Truncated waterfall addressing truncates
the sequence $\{g_i^{(a)}(x)\}$ to end at $i = \log s$. A priori, this does not necessarily seem like progress, since (a)
it introduces an issue of what to do on a truncation overflow, that is, when $g_i^{(a)}(x) > s + j$ for all of
$i \in \{1, 2, \ldots, \log s\}$, so that the search for an address does not terminate within the first $\log s$ $g_i^{(a)}$’s; and (b)
it does not appear to get us any closer to a worst-case constant-time waterfall addressing scheme. We will
address these issues one after another, first showing how to fix truncated waterfall addressing in the case
where there is no valid $g_i^{(a)}$, and then showing how to compute truncated waterfall addresses in constant time.


What to do on truncation overflow. When a truncation overflow occurs, we fall back to assigning x to
reside among the first s chunks using the recursively computed Bin(a - 1, s, x). That is, if none of the $g_i^{(a)}(x)$’s
are usable, then x is assigned to the same position to which it would have been assigned at the end of the
previous doubling, when there were exactly $2^a$ bins. Pseudocode is given in Algorithm 4; the changes from
Algorithm 2 are the truncation conditions marked on lines 3 and 6.

The next lemma establishes that the bin assignments performed by truncated waterfall addressing are 
nearly uniform, as required in the dynamic bin addressing problem. 


Lemma 8. If there are k bins, then for each record x and each bin b, the probability that truncated waterfall
addressing maps x to b is $\frac{1}{k} \cdot (1 + O(1/s))$.

Proof. Suppose that $2^a \le k < 2^{a+1}$ and let j be the number of partial expansions that have occurred, that is,
$k = 2^a + jE$.

Algorithm 4 Truncated Waterfall Address Computation: Computing Bin(a, j, x)

Description: Suppose we are doubling from size $2^a$ to $2^{a+1}$ in chunks of size $E = 2^a/s$ bins. Let m(x) be the
master hash function. This function computes record x’s bin number after the j-th partial expansion. The
smallest allowable table size is s.

1: if $2^{a+1} = s$ and $j = s$ then return $g_1^{(a)}(x)$ end if    ▷ Base case
2: $i \leftarrow 1$
3: while $g_i^{(a)}(x) > s + j$ and $i \le \log s$ do    ▷ Truncation condition: $i \le \log s$
4:     $i \leftarrow i + 1$
5: end while
6: if $g_i^{(a)}(x) \le s$ or $i > \log s$ then return Bin(a - 1, s, x)    ▷ Truncation condition: $i > \log s$
7: else return $g_i^{(a)}(x) \cdot E + (m(x) \bmod E)$
8: end if


The probability that x experiences a truncation overflow is

$$\prod_{i=1}^{\log s} \Pr[g_i^{(a)}(x) > s + j] \le \prod_{i=1}^{\log s} \Pr[g_i^{(a)}(x) > s] = \frac{1}{2^{\log s}} = 1/s.$$

On the other hand, if a truncation overflow occurs then x is assigned to bin Bin(a - 1, s, x). Importantly,
Bin(a - 1, s, x) is uniformly random in the first $2^a$ bins, since truncated waterfall addressing and
(non-truncated) waterfall addressing are equivalent in the case where the table size is an exact power of 2 (in this
case, both algorithms reduce to Algorithm 3).

In summary, each key x has only an O(1/s) probability of being addressed differently by the two algorithms,
and if x is addressed differently, then truncated addressing assigns x uniformly at random among $2^a = \Theta(k)$
bins. This implies the lemma. □
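The truncation rule and the overflow bound can be checked empirically with a short Python sketch. As before, this is only an illustration under our own assumptions: 0-indexed chunks and bins, s a power of two, a hypothetical base hash for the smallest table, BLAKE2b standing in for random hash functions, and invented helper names. The function overflows reports whether all of the first log s chunk hashes point at chunks that do not exist yet.

```python
import hashlib


def h(*parts):
    """Deterministic 64-bit hash of its arguments (stand-in for a random hash)."""
    data = repr(parts).encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def chunk_hash(a, i, x, s):
    """g_i^{(a)}(x), 0-indexed in range(2s)."""
    return h("g", a, i, x) % (2 * s)


def master(x):
    return h("m", x)


def overflows(a, j, x, s):
    """True if none of the first log s chunk hashes names an existing chunk."""
    log_s = s.bit_length() - 1
    return all(chunk_hash(a, i, x, s) >= s + j for i in range(1, log_s + 1))


def truncated_bin(a, j, x, s):
    """Truncated waterfall address of x with 2^a + j*E bins, E = 2^a / s."""
    size = 1 << a
    E = size // s
    log_s = s.bit_length() - 1
    for i in range(1, log_s + 1):            # at most log s chunk hashes
        g = chunk_hash(a, i, x, s)
        if g < s + j:                        # first chunk that already exists
            if g >= s:
                return g * E + master(x) % E # expansion chunk
            break                            # non-expansion chunk
    # Reached on a non-expansion chunk or on a truncation overflow:
    # fall back to the address at the end of the previous doubling.
    if size == s:
        return h("base", x) % s              # assumed base case
    return truncated_bin(a - 1, s, x, s)
```

With s = 8 and j = 0, each chunk hash misses with probability 1/2 in this model, so roughly a 1/8 fraction of keys should overflow, in line with the 1/s bound in the proof.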


Truncated waterfall addressing in worst-case constant time. Our final task is to compute truncated
waterfall addressing in constant time. This will require dealing with two issues: how to efficiently find the first
$g_i^{(a)}(x) \le s + j$ and how to eliminate the recursion. As we shall see, the first problem can be dealt with by
standard bit-manipulation techniques, whereas the second problem requires a more interesting algorithmic solution.

Notice that all of $g_1^{(a)}(x), g_2^{(a)}(x), \ldots, g_{\log s}^{(a)}(x)$ consume $O(\log^2 s)$ bits. As long as s is not too large (i.e.,
$\log^2 s = O(w)$) it follows that the entire sequence $g_1^{(a)}(x), g_2^{(a)}(x), \ldots, g_{\log s}^{(a)}(x)$ can be packed into a single
machine word (using the definition of packing given in Section 3), which we will denote by $\phi^{(a)}(x)$. Moreover,
we can compute the entire sequence in constant time by computing $\phi^{(a)}(x)$ as a single $O(\log^2 s)$-bit hash of x,
and then zeroing out every $(\log s + 1)$st bit in order to add appropriate padding.

By performing bit manipulation on $\phi^{(a)}(x)$, we can perform lines 2-5 of Algorithm 4 in constant time.


Lemma 9. Let r and b be integers so that $r(b+1) \le w$. Let $\phi_1, \phi_2, \ldots, \phi_r$ be b-bit numbers packed into word
$\phi$, and let q be a b-bit number. In constant time, one can determine the minimum i such that $\phi_i \le q$, or return
i = -1 if no such i exists.

Proof. The proof follows the same approach as Lemma 2. Pack r copies of q in a new word Q. Compare Q
with $\phi$ and return the position of the most significant 1-bit of the resulting indicator word. □
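The packed comparison in Lemma 9 can be illustrated with the following Python sketch, which uses Python integers to stand in for machine words. The layout is an assumption made for the example: each b-bit value occupies a field of b + 1 bits whose top bit is padding, and the first value sits in the least significant field, so the answer corresponds to the lowest set indicator bit rather than the highest. The function names are hypothetical, and the loop that builds the constant `ones` would be a precomputed constant in a real word-RAM implementation.

```python
def pack(values, b):
    """Pack b-bit values into one word, one (b+1)-bit field per value."""
    word = 0
    for i, v in enumerate(values):
        assert 0 <= v < (1 << b)
        word |= v << (i * (b + 1))
    return word


def first_at_most(word, r, b, q):
    """Return the smallest 1-based i with phi_i <= q, or -1 if none exists."""
    f = b + 1
    ones = sum(1 << (i * f) for i in range(r))   # constant: 1 in every field
    pad = ones << b                              # constant: the padding bits
    Q = q * ones                                 # r copies of q
    # In each field we compute (2^b + q) - phi_i. The padding bit survives
    # exactly when q >= phi_i, and no borrow crosses a field boundary.
    indicator = ((Q | pad) - word) & pad
    if indicator == 0:
        return -1
    lowest = indicator & -indicator              # isolate the lowest set bit
    return (lowest.bit_length() - 1) // f + 1
```

For example, with the values 9, 12, 5, 7, 3 packed using b = 4, a query with q = 6 returns 3, because the third value (5) is the first one that is at most 6.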


Although lines 2-5 of Algorithm 4 can be evaluated in constant time, there is still the issue of the recursion
on line 6 causing a potentially superconstant running time. Since the recursive subproblems are always on a
power-of-two number of bins, the challenge becomes to evaluate Algorithm 3 in constant time.

Define the promotion sequence P(x) for x to be the indicator word where

$$P_a(x) = \begin{cases} 1 & \text{if } g_1^{(a)}(x) > s, \\ 0 & \text{otherwise.} \end{cases}$$

Another way to view Algorithm 3 is that we are finding the largest $a' \le a$ such that $P_{a'}(x) = 1$, and we are
then returning

$$g_1^{(a')}(x) \cdot E' + (m(x) \bmod E'),$$

where $E' = 2^{a'}/s$. Thus, the task of computing Algorithm 3 reduces to the task of computing $a'$.

If we were given the promotion sequence P(x), then we could determine $a'$ in constant time by standard
bit manipulation (see the discussion of bit manipulation in Section 3). The problem is that P(x) consists of
one bit from each of $\phi^{(1)}(x), \phi^{(2)}(x), \ldots$, each of which individually takes constant time to compute.

To fix this problem, we introduce one final algorithmic idea, reversing the relationship between P(x) and
$\phi^{(1)}(x), \phi^{(2)}(x), \ldots$. Let P(x) be the output of a random hash function, and let $\psi^{(1)}(x), \psi^{(2)}(x), \ldots$ be hash
functions each of which maps x to a word that packs $\log s$ hashes of $\log s$ bits each (for a total of $\log s\,(1 + \log s)$
bits). Then we define $\phi^{(a)}(x)$ to equal $\psi^{(a)}(x)$ except with its most significant bit (i.e., the most significant bit
of $g_1^{(a)}(x)$) overwritten by $P_a(x)$. That is, rather than using one bit from each $\phi^{(a)}(x)$ to determine P(x), we
use P(x) to determine one bit in each $\phi^{(a)}(x)$. Importantly, the construction of $\phi^{(a)}(x)$ is overwriting the most
significant bit of $\psi^{(a)}(x)$ with a random bit, so $\phi^{(a)}(x)$ is still random. On the other hand, the construction
makes it so that P(x) is just a hash of x and can be computed in constant time. Using P(x), we can then
determine $a'$ in constant time using standard bit manipulation, as desired.
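The reversal described above can be sketched in Python as follows. The sketch is hedged in the usual ways: chunks and bins are 0-indexed, s is a power of two, the word size is taken to be 64 bits, the smallest table has s bins addressed by a hypothetical base hash, and all function names are ours. The promotion word P(x) is a single hash, the top bit of $g_1^{(a)}(x)$ is read from bit a of that word, and the largest promoted level at or below a is found with a mask and a most-significant-bit query instead of recursion.

```python
import hashlib


def h(*parts):
    """Deterministic 64-bit hash of its arguments (stand-in for a random hash)."""
    data = repr(parts).encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def promotion_word(x):
    """P(x): bit a of this 64-bit word plays the role of P_a(x)."""
    return h("P", x)


def master(x):
    return h("m", x)


def g1(a, x, s):
    """g_1^{(a)}(x) in range(2s): random low bits, top bit forced to P_a(x)."""
    low = h("psi", a, x) % s
    top = (promotion_word(x) >> a) & 1
    return top * s + low


def bin_after_doubling(a, x, s):
    """Address of x once the 2^a doubling has finished (2^(a+1) bins)."""
    a_min = s.bit_length() - 1                        # 2^a_min = s: smallest table
    mask = promotion_word(x) & ((1 << (a + 1)) - 1)   # keep levels 0..a
    mask &= ~((1 << a_min) - 1)                       # drop levels below a_min
    if mask == 0:
        return h("base", x) % s                       # never promoted
    a_prime = mask.bit_length() - 1                   # largest a' <= a with P_a' = 1
    E = (1 << a_prime) // s
    return g1(a_prime, x, s) * E + master(x) % E
```

The result agrees with unrolling the recursion of Algorithm 3 level by level, but the number of operations does not depend on a.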

Putting the pieces together we arrive at the following theorem which establishes that truncated waterfall 
addressing is a constant time solution to the dynamic bin addressing problem. 


Theorem 3. Suppose that $\log s\,(1 + \log s) \le w$ where w is the machine word size. Then, truncated waterfall
addressing can be computed in constant time, satisfies the clean promotion property, and selects each of k bins
with probability $\frac{1}{k} \cdot (1 + O(1/s))$.


Proof. When evaluating Algorithm 4, we can use Lemma 9 to perform the first level of recursion in constant
time. The next level of recursion is guaranteed to be in the case where there are a power-of-2 number of bins. This case
can be evaluated in constant time using the promotion sequence.

The fact that truncated waterfall addressing satisfies the clean promotion property follows from the
definition. The fact that bin assignment is nearly uniform follows from Lemma 8. □


4.2 Determining Which Records to Move 


The clean promotion property ensures that, when a partial expansion occurs, the number of records whose 
address changes will be roughly a 1/s fraction of all records. If there are n total records, then we wish to be 
able to identify which records to move in time O(n/s). This means that we cannot simply traverse the table 
to find the records. 

In this subsection, we show how to add a small amount of metadata to each bin so that we can efficiently 
detect which records to move during a partial expansion of truncated waterfall addressing. For simplicity, we 
will restrict ourselves to the case of s < polylog n since it is the case that we will care about for Iceberg hashing. 


Linked lists in each bin. In this subsection, we will assume there are O(n/s) bins and that the contents 
of each bin are stored contiguously in an array (note that this is not quite true for Iceberg hashing, since 
Iceberg hashing stores some elements in a backyard, but we will handle this issue later). Within each bin b, we 
maintain s linked lists $L_1(b), L_2(b), \ldots, L_s(b)$, where $L_\ell(b)$ consists of the records whose next address change
will occur on the $\ell$th partial expansion of either the current doubling or some future doubling.

In more detail, for a record x in bin b, the value $\ell$ can be computed as follows. Suppose we are currently
doubling from $2^a$ to $2^{a+1}$ bins and that we have completed j partial expansions, so there are $2^a + jE$ bins. If
$g_1^{(a)}(x) > s + j$, then x’s current bin assignment must be determined by $g_i^{(a)}(x)$ for some $i > 1$. Then for all
$q \in [1, i)$, we have $g_q^{(a)}(x) > s + j$ and thus

$$\ell = \min_{q \in [1, i)} g_q^{(a)}(x).$$ (4)

On the other hand, if $g_1^{(a)}(x) \le s + j$, then x is in its final position for the current doubling. Suppose that x’s next
promotion is during the $2^{a'}$ doubling, that is, $a' = \operatorname{argmin}_{a'' > a} \{P_{a''}(x) = 1\}$.$^{13}$ Let $i = \operatorname{argmin}_q \{g_q^{(a')}(x) \le s\}$
(or $i = \log s + 1$ if no such i exists). In this case,

$$\ell = \min_{q \in [1, i)} g_q^{(a')}(x).$$ (5)

We denote a record x’s choice of $\ell$ by $\ell_j^{(a)}(x)$, where j is the number of partial expansions we have performed
so far in the $2^a$ doubling.

13 If no such $a'$ exists, then we can feel free to not place x in any linked list.


Why the linked lists help. When we are performing the jth partial expansion, we need only examine
the linked list $L_j(b)$ for each bin b. Not all of the elements of $L_j(b)$ will necessarily move during the partial
expansion (some of them will move during future doublings), but all of the elements that we wish to move will
be in a linked list $L_j(b)$ for some b. The next lemma bounds the total number of elements that are examined
during a partial expansion.
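The role of the lists can be illustrated with a small Python sketch of equations (4) and (5). Several simplifications here are ours and not the paper's: chunk hashes are 1-indexed in {1, ..., 2s} as in the equations, lists are keyed directly by the chunk number $\ell$ rather than by a partial-expansion index, the per-bin lists are merged into one list per $\ell$, and the next promotion level is found by a linear scan that stands in for the constant-time bit trick on the promotion sequence. All function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def h(*parts):
    """Deterministic 64-bit hash of its arguments (stand-in for a random hash)."""
    data = repr(parts).encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def g(a, q, x, s):
    """g_q^{(a)}(x) in {1, ..., 2s}."""
    return h("g", a, q, x) % (2 * s) + 1


def first_valid(a, x, s, limit):
    """Smallest q <= log s with g_q^{(a)}(x) <= limit, else log s + 1."""
    log_s = s.bit_length() - 1
    for q in range(1, log_s + 1):
        if g(a, q, x, s) <= limit:
            return q
    return log_s + 1


def current_chunk(a, j, x, s):
    """Expansion chunk holding x after j partial expansions, or None if x
    still sits at its address from the end of the previous doubling."""
    log_s = s.bit_length() - 1
    i = first_valid(a, x, s, s + j)
    if i <= log_s and g(a, i, x, s) > s:
        return g(a, i, x, s)
    return None


def next_move_chunk(a, j, x, s, max_level=64):
    """Chunk whose creation next changes x's address, or None."""
    i = first_valid(a, x, s, s + j)
    if i > 1:                                   # equation (4)
        return min(g(a, q, x, s) for q in range(1, i))
    for a2 in range(a + 1, max_level):          # equation (5), by linear scan
        if g(a2, 1, x, s) > s:                  # P_{a2}(x) = 1
            i = first_valid(a2, x, s, s)
            return min(g(a2, q, x, s) for q in range(1, i))
    return None


def build_lists(keys, a, j, s):
    """Group keys by the chunk that will next move them."""
    lists = defaultdict(list)
    for x in keys:
        ell = next_move_chunk(a, j, x, s)
        if ell is not None:
            lists[ell].append(x)
    return lists
```

In this model, every key whose address changes when chunk s + j + 1 is created appears in the list for that chunk, so the expansion only needs to scan that list. Some keys in the list stay put because their move belongs to a future doubling.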


Lemma 10. Let n be the number of records, let k be the number of bins, suppose $s \le \operatorname{polylog} n$, and let $j \in [s]$.
Then w.s.h.p. in n,

$$\sum_{b=1}^{k} |L_j(b)| = O(n/s).$$ (6)


Proof. It suffices to bound the expected value of (6), since the lemma then follows by a Chernoff bound. Let
$L = \bigcup_{b=1}^{k} L_j(b)$. There are two cases for an element $x \in L$:

• Case 1: x’s address changes during the jth partial expansion, meaning that x gets moved into the
(s + j)th chunk. By Lemma 8, the expected number of records x in this case is O(n/s).

• Case 2: x’s address does not change again until a future $2^{a'}$ doubling, $a' > a$. That is, $a'$ is the smallest $a' > a$
such that $P_{a'}(x) = 1$ (or, equivalently, $g_1^{(a')}(x) > s$). In this case, the probability that $x \in L$ is at most

$$\Pr[\ell_j^{(a)}(x) = j \mid g_1^{(a')}(x) > s].$$

Since $\Pr[g_1^{(a')}(x) > s] = 1/2$, the above probability is at most

$$2 \Pr[\ell_j^{(a)}(x) = j].$$

On the other hand, by Lemma 8, each record has an O(1/s) chance that $\ell_j^{(a)}(x) = j$. Thus, the expected
number of records in this case is O(n/s).

□


Maintaining the lists. When maintaining the linked lists, there are two concerns: the space consumed by 
the linked lists, and the time needed to update the linked lists per hash table operation. 

Because each linked list is confined to a single bin, it can be implemented using pointers consisting of 
O(log h) bits (recall that each bin has capacity O(h)). Assuming that h < polylog n, the linked lists introduce 
at most O(log h) = O(log log n) bits of space overhead per key. 

The larger issue is how to compute $\ell_j^{(a)}(x)$ in constant time for a given record x. Here, we make use of the
following remarkable fact.


Lemma 11. Let $s_1, s_2, \ldots, s_k$ be b-bit numbers packed into word S so that the numbers and padding bits take
no more than $\sqrt{w/3}$ bits. Then $\min_i s_i$ can be computed in constant time.


Proof. The idea behind this proof is to construct two words A and D, where A contains k copies of $s_1, \ldots, s_k$
and D consists of k copies of $s_1$ followed by k copies of $s_2$ and so on. Then O(1) word operations on A and D
can be used to compare every pair $s_i, s_j$, yielding a comparison-indicator word E from which we compute
the minimum.

Let A be a word containing k copies of $s_1, \ldots, s_k$, with a copy of the sequence appearing every $3(b+1)k$
bits. Let B be a word containing k copies of $s_1, \ldots, s_k$, with a copy of the sequence appearing every $3(b+1)(k-1)$
bits. Mask out all but the numbers stored at multiples of $3(b+1)k$ from B and call this C. So C has $s_1$
right justified in the last $3(b+1)k$ bits, $s_2$ in the preceding $3(b+1)k$ bits, etc. Let D consist of k copies
of C, with each copy shifted left by b + 1 positions. Now D consists of k copies of $s_1$, then k copies of $s_2$,
etc. Compare D with A to perform an all-pairs comparison between the $s_i$ (recall that we discuss how to
perform comparisons of packed machine words in Section 3).

Let E be the resulting comparison-indicator word. Consecutive indicator bits are always separated by b
bits, and we set these b-bit separations to consist of all 1s. Now we are looking for a run of $(b+1)k$ 1s in a row,
indicating that some $s_i$ is no greater than all the other $s_j$’s. We find this by adding one to the least significant
position of each putative run, to see if the summation carries along the entire length of the run. We then
identify the first such run by masking out all but the potential “carry” bits (one bit after each potential run)
and computing the minimum $s_i$ from the position of the least significant carry bit. □
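The all-pairs comparison can be made concrete with the following Python sketch, in which Python integers stand in for machine words. It follows the idea of the proof but uses a simpler layout of our own choosing, so it should be read as an illustration rather than a transcription: fields are b + 1 bits wide, each group of k fields is followed by one spare field that catches the carry, and the words are therefore k(k + 1)(b + 1) bits long. The constants are built with loops for readability. They depend only on k and b and would be precomputed in practice, while the data-dependent part uses a fixed number of multiplications, subtractions, and bitwise operations. The names pack and packed_min are hypothetical.

```python
def pack(values, b):
    """Pack b-bit values into one word, one (b+1)-bit field per value."""
    word = 0
    for i, v in enumerate(values):
        assert 0 <= v < (1 << b)
        word |= v << (i * (b + 1))
    return word


def packed_min(S, k, b):
    """Minimum of the k b-bit values packed in S, via word-parallel compares."""
    f = b + 1                       # field width: b data bits + 1 padding bit
    G = (k + 1) * f                 # group stride: k fields + 1 spare field
    data_mask = (1 << b) - 1
    # Constants that depend only on (k, b).
    ones_k = sum(1 << (j * f) for j in range(k))        # 1 in each field of a group
    group_starts = sum(1 << (i * G) for i in range(k))  # 1 at the start of each group
    diag = sum(1 << (i * k * f) for i in range(k))      # shifts that align s_i to group i
    H = (ones_k << b) * group_starts                    # all padding bits
    fill = (ones_k * data_mask) * group_starts          # all data bits
    # Data-dependent part: a constant number of word operations.
    A = S * group_starts                                # group i holds s_1..s_k
    C = (S * diag) & (data_mask * group_starts)         # group i starts with s_i
    D = C * ones_k                                      # group i holds k copies of s_i
    E = ((A | H) - D) & H                               # pad bit (i, j) set iff s_j >= s_i
    # A group made entirely of set pad bits becomes a run of 1s once the data
    # bits are filled in. Adding 1 at its start carries into the spare field.
    carries = ((E | fill) + group_starts) & (group_starts << (k * f))
    lowest = carries & -carries
    i = (lowest.bit_length() - 1) // G
    return (S >> (i * f)) & data_mask
```

For instance, packing 5, 3, 7 with b = 3 and calling packed_min on the result with k = 3 yields 3, since only the group built from the second value compares as no greater than every other value.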


We now consider the task of computing $\ell_j^{(a)}(x)$. Assuming that we have $a'$ and i, then (4) and (5) can be
evaluated in constant time using Lemma 11. The value of $a'$ in (5) can be found in constant time by using
standard bit tricks on the promotion sequence. The value of i (in either (4) or (5)) can then be found using
Lemma 9. Thus we can obtain $\ell_j^{(a)}(x)$ in constant time.

Putting the pieces together, we arrive at the following theorem. 


Theorem 4. Let n be the number of records. Assume $h = \Omega(s)$ and $h \le \operatorname{polylog} n$, where O(h) is the
maximum bin size. Then, the linked lists $L_j(b)$ can be maintained in constant time per operation and induce
at most O(log h) bits of overhead per key. Moreover, w.s.h.p. in n, the set of records that move during the next
partial expansion can be identified in time O(n/s).


We remark that partial contractions (that is, when a chunk is removed rather than added) are much 
simpler than partial expansions because the set of records that must be moved is readily apparent (they are 
the records in the chunk being removed). 


4.3 Implementing (Truncated) Waterfall Addressing in an Iceberg Hash Table 


In this subsection, we describe how to implement waterfall addressing in an Iceberg hash table in order to 
achieve efficient dynamic resizing. 

Because the backyard in an Iceberg hash table is so small, it can be maintained using any (deamortized) 
resizing scheme. Thus our focus will be on resizing the number of bins in the front yard of the table. We use 
truncated waterfall addressing with partial expansions (and contractions) to resize the table. 

We use the standard Allocate Free Model of memory [34]. If we are performing s partial expansions per
doubling, then the total number of memory allocations for a table of size n is $O(s \log n) \le \operatorname{polylog} n$. We will assume
that we have a large enough cache that pointers to the allocated memory chunks can be cached at all times.


The main challenge: Maintaining the Iceberg analysis. Recall from the analysis of (static-size) Iceberg 
hashing that the Iceberg Lemma is used to upper bound the number of items in the backyard. When we move 
items, if we are not careful, we may end up pushing extra items into the backyard and arriving at a state 
that cannot be analyzed by the Iceberg Lemma. Thus our use of waterfall addressing in Iceberg hashing, and
its deamortization, ends up with some complications in order to guarantee something fairly straightforward:
that the state of the system (including who is in the front yard and who is in the backyard) is consistent with an
instantaneous expansion or contraction that can be analyzed by the Iceberg Lemma. 

More specifically, the issue that we must be careful about is the following. Whenever we move a record r 
into a new bin b during an expansion or contraction, one can think of that move as representing a new insertion 
into the bin b. But the timing of the insertion will be dependent on the bin number b (and on where the record 
was before the move), which means we cannot simply analyze the insertion as being into a random bin. That 
is, we must analyze records r that are moving around due to a partial expansion or contraction differently
than we would treat records that are being inserted by the user. 


Implementing partial expansions. Suppose we are adding a new chunk C, and let $t_0$ be the time at which
we begin the partial expansion.
At time $t_0$, we allocate memory to the chunk C. For each record x, let $\mathrm{bin}_{\mathrm{old}}(x)$ denote the bin that x would be
assigned to without chunk C and let $\mathrm{bin}_{\mathrm{new}}(x)$ denote the bin that x would be assigned to with chunk C present
(i.e., after the partial expansion). For now let us assume that, until the partial expansion is complete, queries
will treat the chunk C as being semi-present, meaning that a query for a record x will check both $\mathrm{bin}_{\mathrm{old}}(x)$ and
$\mathrm{bin}_{\mathrm{new}}(x)$. Of course, most records x will satisfy $\mathrm{bin}_{\mathrm{old}}(x) = \mathrm{bin}_{\mathrm{new}}(x)$, in which case the query is unaffected.

Once C has been allocated, the partial expansion is performed in three parts. 


• The Preprocessing Phase: In this phase, we construct a new counter in each bin that we call the
demand counter. The demand counter in bin b keeps track of how many records x in the table
(including in the backyard and in other bins) satisfy either $\mathrm{bin}_{\mathrm{old}}(x) = b$ or $\mathrm{bin}_{\mathrm{new}}(x) = b$. (Importantly,
this means that a single record could contribute to two different demand counters.)


Later we will describe how to deamortize the phase. As the phase is performed, any concurrent 
operations also update the demand counters of the bins that they modify. 


• The Time Freeze: Let $t_1$ be the moment in time immediately after the Preprocessing Phase completes.
We refer to $t_1$ as the time freeze point. Roughly speaking, we will try to simulate the partial expansion
as having occurred instantaneously at time $t_1$.


At time $t_1$, every bin reserves some of its slots for records that are currently in the table.$^{14}$ If a bin has
demand counter d, and the capacity of the bin is $r = h + \tau_h$, then the bin reserves min(d, r) slots for records
currently in the table. Any records that are currently in the bin are immediately given reserved slots.


• The Reshuffling Phase: Call a record grandfathered if it was in the table at time $t_1$ and has
remained in the table since. The reshuffling phase identifies which records in the table (including both
in the front yard and in the backyard) are grandfathered,$^{15}$ and attempts to move each grandfathered record x to
a reserved slot in $\mathrm{bin}_{\mathrm{new}}(x)$. If there is a free reserved slot in $\mathrm{bin}_{\mathrm{new}}(x)$, then x is given that slot, and
otherwise x is sent (possibly back) to the backyard. If x is being moved from $\mathrm{bin}_{\mathrm{old}}(x)$ in which it was
taking up a reserved slot, then the number of reserved slots in that bin is decremented by 1 (because x
is no longer present in that bin).


Later we will describe how to deamortize the phase. During the phase, concurrent operations may take
place. If a grandfathered record x is deleted, then for each of the bins $b \in \{\mathrm{bin}_{\mathrm{old}}(x), \mathrm{bin}_{\mathrm{new}}(x)\}$, if x was
either residing in bin b or if there is a free reserved slot in bin b (think of this slot as being reserved for x),
then the operation that removes x also decrements the number of reserved slots in bin b. If a new record
x is inserted during the phase (note that x is therefore not grandfathered), and the only free slots in
$\mathrm{bin}_{\mathrm{new}}(x)$ are reserved, then x is sent to the backyard despite there being free slots in $\mathrm{bin}_{\mathrm{new}}(x)$.


Once the Reshuffling Phase is complete, the partial expansion is also complete. Call this time t2. 
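The slot-reservation rule from the three phases can be sketched with a small Python class. This is a simplified model under our own assumptions, not the paper's data structure: a bin is just a set of record identifiers, the demand counter is passed in as a number, fingerprint collisions and floating counters are ignored, and the class and method names are hypothetical. It is meant only to show how reserved slots keep later insertions from displacing grandfathered records.

```python
class FrontYardBin:
    """Toy model of one front-yard bin during a partial expansion."""

    def __init__(self, capacity):
        self.capacity = capacity        # r: total number of slots in the bin
        self.records = set()            # all records currently in the bin
        self.grandfathered = set()      # residents that hold a reserved slot
        self.reserved = 0               # reserved slots, occupied or free

    def freeze(self, demand):
        """Time freeze: reserve min(d, r) slots; residents get reserved slots."""
        self.reserved = min(demand, self.capacity)
        self.grandfathered = set(self.records)

    def free_reserved(self):
        return self.reserved - len(self.grandfathered)

    def free_unreserved(self):
        newcomers = len(self.records) - len(self.grandfathered)
        return self.capacity - self.reserved - newcomers

    def insert_new(self, x):
        """Insert a non-grandfathered record. False means: send to backyard."""
        if self.free_unreserved() > 0:
            self.records.add(x)
            return True
        return False                    # free slots may exist, but all are reserved

    def place_grandfathered(self, x):
        """Reshuffling: move a grandfathered record into a free reserved slot."""
        if self.free_reserved() > 0:
            self.records.add(x)
            self.grandfathered.add(x)
            return True
        return False                    # caller sends x to the backyard

    def remove_grandfathered(self, x):
        """x moved out or was deleted: its reserved slot disappears with it."""
        self.records.discard(x)
        self.grandfathered.discard(x)
        self.reserved -= 1
```

For example, a bin of capacity 4 that holds one record at the freeze and has demand 3 reserves three slots, so a single later insertion fits and the next one goes to the backyard even though two reserved slots are still empty.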


One minor technical issue that we must be careful about during the Reshuffling Phase is that Iceberg 
hashing requires that no two records in a given bin have the same fingerprint. Thus, when placing a 
grandfathered record into a free reserved slot in a bin, we must handle the following additional two cases: if 
there is another grandfathered record in the bin with the same fingerprint as x, then x is sent to the backyard 
and the number of reserved slots in the bin is decremented by 1 (i.e., the reserved slot given to x is removed); 
if there is another non-grandfathered record y in the bin such that y has the same fingerprint as x, then y is 
sent to the backyard and x is given the reserved slot. 

Recall that each bin must keep a floating counter that tracks the number of items that hash to the bin
but are in the backyard. During the Preprocessing and Reshuffling phases, we must be careful to keep the
floating counters in consistent states, as follows. During the Preprocessing Phase, the floating counters for
each bin $b \in C$ are initialized to be the number of records x in the backyard such that $\mathrm{bin}_{\mathrm{new}}(x) = b$. Then,
during the Reshuffling Phase, whenever the traversal visits a grandfathered record x in the backyard such that
$\mathrm{bin}_{\mathrm{new}}(x) \in C$, the floating counter for $\mathrm{bin}_{\mathrm{old}}(x)$ is decremented (in essence, $\mathrm{bin}_{\mathrm{new}}(x)$ is now declared to be
responsible for record x, even if x remains in the backyard).


14 These reservations are performed logically at $t_1$ but do not require any physical action at $t_1$.

15 Since space efficiency in the backyard is not important, we can simply have separate tables for the grandfathered and
non-grandfathered records.

Putting the pieces together. As described above, the purpose of the three phases in each partial expansion
is to simulate the expansion as having occurred at a single point in time $t_1$. In Appendix B, we prove that
partial expansions (implemented in this way) do not interfere with any of the properties of Iceberg hashing (i.e.,
the backyard remains small, and elements individually have good probability of being in the front yard). The
appendix also describes how to carefully implement the partial expansion such that it is deamortized and I/O
efficient, and describes how to analogously handle partial contractions. The result is the following theorem:


Theorem 5. Consider a dynamic Iceberg hash table with average-bin-fill parameter h. Suppose that the
number n of elements stays in the range such that $\log n/\log\log n \ge \Omega(h)$. Suppose that the table used in the
backyard supports constant-time operations (w.h.p. in n), has load factor at least 1/poly(h), and is stable.
Finally, set the resize granularity $s = \sqrt{h}$.

Consider an operation on a key x. The following guarantees hold. 


• Time Efficiency. The operation on x takes constant time in the RAM model, w.h.p. in n.


• Cache Efficiency. Consider the EM model using a cache line of size $B = \Theta(h)$ and a cache of size
$M \ge c h^{1.5} B + s \log n$ for some sufficiently large constant c, and suppose that each bin in the front yard
is memory aligned, that is, each bin is stored in a single cache line. Finally, suppose that the description
bits of the hash functions are cached. Then the expected number of cache misses incurred by the operation
on x is $1 + O(1/\sqrt{B}) = 1 + o(1)$.


• Space Efficiency. The total space in machine words consumed by the table, w.h.p. in n, is

$$\left(1 + O\!\left(\frac{1}{\sqrt{h}}\right)\right) n = (1 + o(1))\, n.$$


• Stability. If a partial resize has not been triggered in the past O(n) operations, then the table is stable.


Remark 1. In the case where the cache line size is $B = \Theta(h)$, the guarantees in Theorem 5 come close to
matching the best known bounds for external-memory hashing [26]. In particular, [26] achieves load factor
$1 - O(1/\sqrt{B})$ with an average of $1 + O(1/\sqrt{B})$ cache misses per operation.


Remark 2. Theorem 5 requires $h \le O(\log n/\log\log n)$. Note, however, that whenever $\log n$ changes by more
than a factor of two, we can simply rebuild our table (with a new parameter h of our choice). These rebuilds
can be performed space efficiently and are rare enough that they do not hurt the expected cache behavior of
operations. In this sense, the assumption that $\log n/\log\log n \ge \Omega(h)$ is without loss of generality.

In more detail, the rebuilds can be implemented as follows. Break the table’s lifetime into doubling 
windows, consisting of time windows in which the table’s size either doubles or halves. Then place the 
doubling windows into window runs, where each window run is determined as follows: if at the beginning of 
the window run the table size is n, then the window run lasts for a random number k ∈ [1, log n/2] of doubling
windows, after which the next window run begins. During the final window of each window run, we rebuild the 
hash table from scratch using the then appropriate value of h. 

Each rebuild can be performed space efficiently by storing both the new and old versions of the hash table 
as dynamically resized Iceberg hash tables during the rebuild. During a given rebuild, operations may incur 
multiple cache misses, but the probability of a given operation being contained in a window where a rebuild 
occurs is at most O(1/log n) (where n is the current table size), so the expected number of cache misses per
operation remains 1 + O(1/√h).
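
The window-run schedule described in this remark can be sketched in a few lines. This is only an illustration under stated assumptions: the helper name `rebuild_windows` is hypothetical, logarithms are taken base 2, the range [1, log n/2] is read as [1, (log n)/2], and the sketch only decides which doubling windows carry a rebuild (it does not perform one).

```python
import math
import random

def rebuild_windows(sizes, rng):
    """Return the indices of the doubling windows in which a rebuild happens.

    sizes[i] is the table size at the start of doubling window i
    (a hypothetical input; consecutive sizes differ by a factor of two).
    """
    rebuilds = []
    i = 0
    while i < len(sizes):
        n = sizes[i]
        # The run length k is uniform in [1, (log n)/2], and at least 1.
        max_k = max(1, int(math.log2(n) / 2))
        k = rng.randint(1, max_k)
        last = i + k - 1          # final window of this window run
        if last < len(sizes):
            rebuilds.append(last)
        i += k                    # the next window run starts here
    return rebuilds
```

Because each run length is random, a fixed operation falls into a rebuild window only with small probability, which is the property the remark relies on.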


5 Super Space-Efficient Iceberg Hashing


In this section we consider the problem of further optimizing the space efficiency of Iceberg hashing. We
achieve a load factor of 1 − O(log log n/log n). This improves on the previous best known bound [34] of
1 − O(1/√(log n)). In the case where keys and values have O(log n) bits, our table wastes only O(log log n) bits
per key (in comparison with the previous state of the art of O(√(log n)) bits per key).

So far, we have been limited by the fact that the routing table in each bin can only support O(log n/log log n)
records. This, in turn, has limited the average-bin-fill parameter h to O(log n/log log n) and has limited our
best achievable load factor to 1 − O(log log n/√(log n)).


Supporting large bin sizes. Throughout the rest of the section, we consider average-bin-fill parameters h
such that

\[ h \in \left[ \frac{\log n}{\log\log n},\ \frac{\log^2 n}{\log\log n} \right]. \]


Rather than having a single routing table per bin, we now have

\[ k = \left\lceil \frac{h}{\log n/\log\log n} \right\rceil \]

routing tables R_1, R_2, ..., R_k per bin, each of which can route up to 2 log n/log log n fingerprints.

Each key x selects a routing table using a new hash function r : U → [k], where U is the universe of keys.
Operations on key x use routing table R_{r(x)} in bin(x).

Although the key x hashes to a specific routing table R_{r(x)}, the key can still be placed anywhere within
bin(x). That is, the routing table R_{r(x)} maps fingerprints to arbitrary positions in [h + τ_h].

The assignment of keys to routing tables introduces a problem: some routing tables R_i may be assigned more
than 2 log n/log log n keys to route. When this happens, the routing table sends overflow keys to the backyard
of the Iceberg hash table. Keys sent to the backyard by an overflowed routing table are called routing floaters.
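
The overflow rule for routing tables can be illustrated with a small sketch. Everything here is an assumption made for illustration: the class name `Bin`, the use of a Python dict as a routing table, base-2 logarithms, and the simplification that fingerprints within one routing table are distinct. The backyard is modeled as a plain list.

```python
import math

class Bin:
    """One front-yard bin with k routing tables (illustrative only)."""

    def __init__(self, k, table_capacity):
        self.table_capacity = table_capacity
        self.tables = [dict() for _ in range(k)]  # fingerprint -> slot
        self.floating = [0] * k                   # routing floaters per table

    def insert(self, table_idx, fingerprint, slot, backyard):
        table = self.tables[table_idx]
        if len(table) >= self.table_capacity:
            # Overflowed routing table: the key becomes a routing floater.
            backyard.append((table_idx, fingerprint))
            self.floating[table_idx] += 1
            return False
        table[fingerprint] = slot
        return True

def num_routing_tables(h, n):
    """k = ceil(h / (log n / log log n)), with base-2 logs (an assumption)."""
    per_table = math.log2(n) / math.log2(math.log2(n))
    return math.ceil(h / per_table)
```

For example, with n = 2^16 and h = 16 this gives k = 4 routing tables, each with capacity 2 log n/log log n = 8 under the base-2 convention.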

We once again use the Iceberg Lemma, which tells us that, w.s.h.p., there are very few total routing floaters. 


Lemma 12. Let h ∈ [log n/log log n, polylog(n)], and define N as in Section 3. There are O(N/polylog(n))
routing floaters in the table, w.s.h.p. in N. Moreover, for a given key x, the probability that there is a routing
floater y such that (bin(y), r(y)) = (bin(x), r(x)) is at most O(1/poly(h)).


Proof. The proof follows exactly as for Lemma 4, except that now the "bins" in the Iceberg Lemma are the
routing tables rather than the actual bins in the Iceberg hash table. □


Note that the maximum capacity per routing table of 2 log n/log log n is much larger than necessary for
the analysis, since a capacity of log n/log log n + τ_{log n/log log n} would suffice for the proof of Lemma 12. We are
able to apply this much slack to the routing tables because they are a low-order term in the space consumption
of the Iceberg hash table.


Changes to the metadata. To accommodate the large value of h, the bookkeeping in each bin also changes
slightly. Each routing table maintains its own floating counter, and queries on a record x need only go to the
backyard if the floating counter for R_{r(x)} in bin(x) is non-zero. Additionally, since h may
be much larger than log n, we can no longer keep track of the free slots in the bin with a bitmap. Thus the
vacancy bitmap is replaced with a free list, which is a linked list of the free slots in the bin.
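
One way to picture the free list is as a linked list threaded through an array indexed by slot number. The layout below is an assumption made for illustration (the text only says that the free list is a linked list of free slots), and the class name `FreeList` is hypothetical.

```python
class FreeList:
    """Free slots of one bin, kept as a linked list threaded through an array."""

    def __init__(self, capacity):
        # next_free[i] is the free slot that follows slot i; -1 ends the list.
        self.next_free = [i + 1 for i in range(capacity)]
        if capacity:
            self.next_free[-1] = -1
        self.head = 0 if capacity else -1

    def allocate(self):
        """Pop a free slot in O(1); return -1 if the bin is full."""
        slot = self.head
        if slot != -1:
            self.head = self.next_free[slot]
        return slot

    def release(self, slot):
        """Push a slot back onto the list in O(1)."""
        self.next_free[slot] = self.head
        self.head = slot
```

Both operations touch a constant number of words, so they fit within the constant-time budget regardless of how large h is.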


The non-resizing case. We can now extend Theorem 2 to hold for h ≤ log² N/log log N.


Theorem 6. Consider an Iceberg hash table that never contains more than N elements and suppose that
the average-bin-fill parameter satisfies h = O(log² N/log log N). Suppose that the backyard table supports
constant-time operations (w.h.p. in n), supports load factor at least 1/poly(h), and is stable.

Consider a sequence of operations in which the number of records in the table never exceeds N, and consider 
a query, insert, or delete that is performed on some key x. Then, the following guarantees hold. 


• Time Efficiency. The operation on x takes constant time in the RAM model, w.h.p. in N.


• Cache Efficiency. Consider the EM model using a cache line of size B = O(h) and a cache of size
M = Ω(B), and suppose that each bin in the front yard is memory aligned, that is, each bin is stored in
a single cache line. Finally, suppose that the description bits of the hash functions are cached. Then the
operation on x has probability at least 1 − 1/poly(B) of incurring only a single cache miss.


• Space Efficiency. The total space in machine words consumed by the table, w.h.p. in N, is

\[ \left(1 + O\left(\sqrt{\frac{\log h}{h}}\right)\right) N = (1 + o(1))\, N. \]


• Stability. The hash table is stable.


Proof. The proof is the same as for Theorem 2, except with two changes for the case of
h ∈ [log N/log log N, O(log² N/log log N)].

First, we must account for the space consumed by the routing tables R_1, ..., R_k in each bin. Fortunately,
these tables consume only O(h log h) bits per bin, in comparison to the O(h log N) bits otherwise needed for
the bin. Thus the routing tables only increase the total space consumption by a factor of at most

\[ 1 + O\left(\frac{\log h}{\log N}\right) \le 1 + O\left(\frac{\log\log N}{\log N}\right) \le 1 + O\left(\sqrt{\frac{\log h}{h}}\right), \]

where the inequalities use that h ∈ [log N/log log N, O(log² N/log log N)].

Second, we must account for the presence of routing floaters in the backyard. This is handled by simply
applying Lemma 12. □


Corollary 1. In Theorem 6, when h = log² N/log log N, the total space in machine words consumed becomes

\[ \left(1 + O\left(\frac{\log\log N}{\log N}\right)\right) N. \]
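
The bound in Corollary 1 can be checked by substituting h = log² N/log log N into an overhead term of the form O(√(log h/h)). The short calculation below is a sketch that uses only the fact that log h = Θ(log log N) for this choice of h.

```latex
\[
\sqrt{\frac{\log h}{h}}
  = \sqrt{\frac{\Theta(\log\log N)\cdot \log\log N}{\log^2 N}}
  = \Theta\!\left(\frac{\log\log N}{\log N}\right).
\]
```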


Supporting dynamic resizing. One can support dynamic resizing using essentially the same approach
as in Theorem 5. The process of performing a partial expansion or contraction must be slightly modified
to accommodate the routing-table structure of each bin, however. In particular, we now maintain demand
counters d_{b,i} for each routing table R_i in each bin b. When a time freeze occurs, the bin reserves


\[ \min\left( \sum_{i=1}^{k} \min\left(d_{b,i},\ 2\log n/\log\log n\right),\ h + \tau_h \right) \tag{7} \]


slots for records currently in the bin. That is, the bin reserves d_{b,i} slots per routing table, subject to the capacity
constraints of the routing tables and the bin. If (7) is h + τ_h, then the bin can determine arbitrarily how many
slots are reserved for each routing table, as long as the i-th routing table has at most min(d_{b,i}, 2 log n/log log n)
slots reserved and the total number of reserved slots is h + τ_h.¹⁶

The proofs of lemmas analogous to Lemma 7, Lemma 8, and Lemma 9 follow exactly as in Section
4.3, except that now routing floaters are accounted for in addition to capacity floaters and
fingerprint floaters.¹⁷
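
The reservation rule (7) is simple to evaluate for a single bin. The sketch below is illustrative only: the function name `reserved_slots` and its argument names are hypothetical, the bin capacity is passed in as h plus a slack term, and logarithms are taken base 2.

```python
import math

def reserved_slots(demand, h, tau_h, n):
    """Evaluate rule (7) for one bin.

    demand[i] is the demand counter of the i-th routing table in the bin;
    tau_h is the bin's slack, so the bin capacity is h + tau_h.
    """
    # Per-routing-table capacity 2 log n / log log n (base-2 logs assumed).
    table_cap = 2 * math.log2(n) / math.log2(math.log2(n))
    wanted = sum(min(d, table_cap) for d in demand)
    return min(wanted, h + tau_h)
```

With n = 2^16 (table capacity 8), h = 16, and slack 4, demands of [3, 10, 0, 8] reserve 19 slots, while demands of [10, 10, 10, 10] are capped by the bin capacity of 20.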

Putting the pieces together, we can extend Theorem 5 to support larger values of h.


Theorem 7. Consider a dynamic Iceberg hash table with average-bin-fill parameter h. Suppose that the
number n of elements stays in the range such that h = O(log² n/log log n). Suppose that the table used in
the backyard supports constant-time operations (w.h.p. in n), has load factor at least 1/poly(h), and is stable.
Finally, set the resize granularity s = √h.

Consider an operation on a key x. The following guarantees hold.


• Time Efficiency. The operation on x takes constant time in the RAM model, w.h.p. in n.


• Cache Efficiency. Consider the EM model using a cache line of size B = O(h) and a cache of size
M ≥ c·h^{1.5}·B + s log n for some sufficiently large constant c, and suppose that each bin in the front yard
is memory aligned, that is, each bin is stored in a single cache line. Then the expected number of cache
misses incurred by the operation on x is 1 + O(1/√B) = 1 + o(1).


• Space Efficiency. The total space in machine words consumed by the table, w.h.p. in n, is

\[ \left(1 + O\left(\sqrt{\frac{\log h}{h}}\right)\right) n = (1 + o(1))\, n. \]


• Stability. If a partial resize has not been triggered in the past O(n) operations, then the table is stable.


¹⁶ To simplify accounting, the number of slots reserved for each routing table can be determined lazily during the Reshuffling
Phase. That is, only when the routing table R_i is next accessed do we decide how many slots were reserved for it at the time freeze.

¹⁷ Since there are now multiple demand counters per bin, we must be careful that the time (in the RAM model) to perform a
partial expansion or contraction is still O(n/s), where s is the resize granularity used by waterfall addressing. Fortunately, the
current values for the demand counter of each routing table, and the value of Σ_{i=1}^{k} min(d_{b,i}, 2 log n/log log n) for each bin b, are
straightforward to keep track of at all times (rather than just during the Preprocessing Phase) while adding only O(1) overhead
per operation. Thus, in the case of a partial expansion, the Preprocessing Phase needs only to instantiate these values in the new
bins, and in the case of a partial contraction, the Preprocessing Phase needs only to update the values appropriately to take account
of the O(n/s) records that are being relocated.


Corollary 2. In Theorem 7, when h = log² N/log log N, the total space in machine words consumed becomes

\[ \left(1 + O\left(\frac{\log\log N}{\log N}\right)\right) N. \]


6 Achieving Subpolynomial Failure Probabilities 


In this section, we consider the problem of achieving subpolynomial probabilities of failure for Iceberg hashing
assuming access to fully random hash functions (as in past work [21, 22], the assumption of fully random hash
functions is needed to avoid failure probability that is introduced by the hash functions themselves).

We begin by stating a version of Theorem 7 assuming fully random hash functions. The theorem follows
immediately from the w.s.h.p. guarantees offered by the lemmas in the previous sections.


Theorem 8 (Theorem 7 with super-high probability). In the conditions of Theorem 7, suppose that the
backyard table T supports each operation in constant time with probability 1 − p(n). Then, assuming fully
random hash functions, the guarantees of the Iceberg hash table hold with probability 1 − O(p(n) + 2^{−n/polylog(n)})
per operation.


We now consider the problem of designing a backyard hash table T that has a super small failure probability
p (the same failure probability can then be achieved by Iceberg hashing, using Theorem 8). By employing
the very-high probability hash table of Goodrich, Hirschberg, Mitzenmacher, and Thaler [22], one can achieve
p = 2^{−polylog(n)}. In this section, we show how to do significantly better when each key is O(log n) bits. For
this case, we are able to achieve p = O(2^{−n^{1−ε}}) for a positive constant ε of our choice.

Throughout the rest of this section, set δ = ε/4, so we are aiming for p = O(2^{−n^{1−4δ}}), and set the machine
word size w = O(log n).


The difficulty of subpolynomial guarantees: not enough random bits. The main difficulty that one
encounters when trying to achieve a failure probability p that is subpolynomial is that hash collisions must
be treated as the common case. That is, since any two keys have a 1/poly(n) chance of colliding (on any
w = O(log n)-bit hash function), we must be able to handle a superconstant number of keys colliding on their
hash functions. If we want p = O(2^{−n^{1−4δ}}), then we must be willing to tolerate Ω(n^{1−4δ}/log n) keys colliding
with one another.


Storing n^{1−2δ} keys deterministically. In order to store a small set of n^{1−2δ} keys deterministically, we will
make use of a radix trie with fanout n^δ. We formalize the properties that we will need from the radix trie in
the following lemma.


Lemma 13. Suppose keys are w = O(log n) bits and let δ > 0 be a constant. There exists a deterministic data
structure that can be initialized in time o(n), that consumes space o(n), and that supports insertions, deletions,
and queries in constant time on a set of up to O(n^{1−2δ}) keys.


Proof. As noted above, the data structure is a radix trie with fanout n^δ. The root node r of the trie is an array of
length n^δ. The ith entry in r is null if there are no keys x whose first δ log n bits equal i. Otherwise, the ith entry
points to a recursively-defined trie storing the final w − δ log n bits of each key x whose first δ log n bits equal i.

The trie has depth O(1/δ) = O(1). Since the data structure stores O(n^{1−2δ}) keys, the trie can have at
most O(n^{1−2δ}) nodes. The total space consumption is therefore O(n^{1−δ}) since each node consumes n^δ space.

The data structure requires O(n^{1−δ}) time to initialize, where the initialization time is spent allocating
O(n^{1−2δ}) arrays that each consist of n^δ null pointers. These arrays can then be used to implement operations
on the trie in constant time. □
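
The trie from the proof can be sketched as follows. This is an illustration under assumptions, not the construction itself: the class name `RadixTrie` is hypothetical, keys are non-negative integers of a fixed bit width, the chunk width stands in for δ log n, and nodes are allocated lazily rather than preallocated as in the proof. Emptied nodes are not reclaimed on deletion.

```python
class RadixTrie:
    """Deterministic trie over fixed-width integer keys (a sketch)."""

    def __init__(self, key_bits, chunk_bits):
        assert key_bits % chunk_bits == 0
        self.chunk_bits = chunk_bits
        self.depth = key_bits // chunk_bits   # O(1/delta) levels
        self.fanout = 1 << chunk_bits         # plays the role of n^delta
        self.root = [None] * self.fanout

    def _chunks(self, key):
        # Most significant chunk first, as in the proof.
        mask = self.fanout - 1
        for level in reversed(range(self.depth)):
            yield (key >> (level * self.chunk_bits)) & mask

    def insert(self, key):
        node = self.root
        *inner, last = self._chunks(key)
        for c in inner:
            if node[c] is None:
                node[c] = [None] * self.fanout
            node = node[c]
        node[last] = True

    def contains(self, key):
        node = self.root
        for c in self._chunks(key):
            node = node[c]
            if node is None:
                return False
        return node is True

    def delete(self, key):
        node = self.root
        *inner, last = self._chunks(key)
        for c in inner:
            node = node[c]
            if node is None:
                return
        node[last] = None
```

Each operation walks a fixed number of levels and performs one array access per level, so the running time is constant whenever the depth is constant.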




Storing all but O(n^{1−2δ}) keys in bins. We now describe a hash table with failure probability p = O(2^{−n^{1−4δ}}).
Because we are constructing a hash table to be used as a backyard, and thus we are not concerned about space
efficiency (a load factor of O(1) is okay), we can ignore the issue of dynamic resizing (which can be performed
with deamortized rebuilds) and the issue of deletions (which can be performed by marking elements as deleted
and then rebuilding the data structure every O(n) operations). Thus, we can assume there are O(n) records
and that the only operations are queries and insertions.

We maintain n/log n bins, each with capacity O(log n). Queries and insertions are implemented in each bin
using the dynamic fusion tree of Pătraşcu and Thorup [42], which supports constant time deterministic operations
on a set of size polylog n. If a bin overflows (that is, there are more than c log n records for some large constant c),
then the overflow records are stored in the data structure from Lemma 13. Call these records stragglers.
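
The split between bins and stragglers can be simulated with a short sketch. The function name `place_keys` is hypothetical, a seeded pseudorandom generator stands in for a fully random hash function, Python lists stand in for the dynamic fusion trees, and logarithms are taken base 2.

```python
import math
import random

def place_keys(keys, n, c, rng):
    """Throw keys into n/log n bins of capacity c*log n; overflow keys are stragglers."""
    log_n = max(1, int(math.log2(n)))
    num_bins = max(1, n // log_n)
    capacity = c * log_n
    bins = [[] for _ in range(num_bins)]
    stragglers = []
    for key in keys:
        b = rng.randrange(num_bins)   # stand-in for a fully random hash
        if len(bins[b]) < capacity:
            bins[b].append(key)       # the text uses a dynamic fusion tree here
        else:
            stragglers.append(key)    # would go to the structure of Lemma 13
    return bins, stragglers
```

Since the expected load of a bin is about log n while its capacity is c log n, overflow is rare for a large constant c.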


Lemma 14. With probability 1 − O(2^{−n^{1−4δ}}), there are O(n^{1−2δ}) stragglers at any given moment.


Proof. The fact that we need only consider insertions allows for the following analysis. The expected number
of stragglers is o(1) since each bin has a 1/poly(n) probability of overflowing. On the other hand, the number
of stragglers is a function of O(n) independent random variables (i.e., the bin choice for each ball that is
present), and each of these random variables can only affect the number of stragglers by ±1. Thus we can
apply McDiarmid's inequality (see Theorem 1) to obtain a concentration bound on the number of stragglers.
This implies that there are O(n^{1−2δ}) stragglers with probability at least 1 − O(2^{−n^{1−4δ}}). □
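
The concentration step can be spelled out as follows. This is a sketch assuming the standard form of McDiarmid's inequality for a function f of m independent random variables, each of which changes f by at most 1. Here f is the number of stragglers, m ≤ c'n for some constant c', and t = c·n^{1−2δ} for a constant c of our choice.

```latex
\[
\Pr\bigl[f - \mathbb{E}[f] \ge t\bigr]
  \le \exp\!\left(-\frac{2t^2}{m}\right)
  \le \exp\!\left(-\frac{2c^2\, n^{2-4\delta}}{c'\, n}\right)
  = \exp\!\left(-\frac{2c^2}{c'}\, n^{1-4\delta}\right).
\]
```

Choosing c large enough relative to c' makes the right-hand side at most 2^{−n^{1−4δ}}, and since the expected number of stragglers is o(1), this bounds the number of stragglers by O(n^{1−2δ}).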


Putting the pieces together, and using δ = ε/4, we arrive at the following theorem.


Theorem 9. Consider keys that are O(log n) bits and let ε > 0 be a constant. There is a hash table (using
fully random hash functions) that supports constant time operations and constant load factor with failure
probability O(2^{−n^{1−ε}}) per operation.

Proof. This follows from Lemmas 13 and 14. □


Corollary 3. Consider keys that are O(log n) bits and let ε > 0 be a constant. Iceberg hashing with fully
random hash functions can be implemented with failure probability O(2^{−n^{1−ε}}) per operation.


7 Succinctness Through Quotienting 


So far, we have focused on designing an explicit data structure, that is, a space-efficient data structure that 
explicitly stores each key-value pair somewhere in memory. Such a data structure does not achieve the 
information-theoretic optimum memory consumption, however. Given a set of n keys from a universe U, the 
minimum number of bits needed to encode the set is 


\[ \log \binom{|U|}{n}, \]

which by Stirling's approximation is n log(|U|/n) + O(n). In this section, we give a succinct version of the dynamic
Iceberg hash table that, assuming that |U| ≤ poly n, stores n keys using space

\[ n \log \frac{|U|}{n} + O(n \log\log n) \]

bits. The table can also support v-bit values for each key using an additional v bits of space per key.


Using quotients to save space. Although we will remove the assumption later, for now let us assume that 
our keys are selected at random from the universe U. This means that the master hash m(x) of each key can 
simply use the low-order bits of the key x. These bits, in turn, do not need to be explicitly stored in the hash 
table. (This space-saving technique is often called quotienting.)
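
As a small illustration of the idea, the sketch below splits an integer key into the low-order bits that choose its location and the remaining bits that must be stored. The function names and the parameter q (the number of bits taken from the key) are hypothetical, and the sketch ignores how waterfall addressing decides how many bits each bin uses.

```python
def split_key(key, q):
    """Split a key into (location from the low q bits, bits that must be stored)."""
    return key & ((1 << q) - 1), key >> q

def rebuild_key(location, stored, q):
    """Recover the full key from where it resides and the bits stored there."""
    return (stored << q) | location
```

For example, `split_key(182, 3)` gives location 6 and stored bits 22, and `rebuild_key(6, 22, 3)` recovers 182, so the three low-order bits never need to be written down.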


Storing some keys with fewer bits than others. Recall that only part of each key’s address is determined 
by its master hash, and that, at any given moment, different keys may use different numbers of bits from their 
master hash. All of the keys within a given bin use the same number of bits from their master hashes, however. 
In particular, the keys in bins whose indices are in the range I_i = (s·2^{i−1}, s·2^i] all use i bits from their master
hash. Thus, we can implement the bins in I_i to only explicitly store log |U| − i bits of each key, with the rest
of the bits for the key being stored implicitly by quotienting.

A consequence of this design is that some bins are more space efficient than others. If there are m bins,
then the most space efficient bins use R = log |U| − log m + log s bits per key (ignoring space used for metadata
and empty slots) and the least space efficient bins use log |U| bits per key. The fraction of bins that use R + i
bits per key is O(1/2^i). Thus, the total number of bits wasted by not storing exactly R bits per key is

\[ O\left(\sum_{i \ge 1} \frac{n\, i}{2^i}\right) \le O(n). \]

Critically, the fact that some keys save more bits than others only affects our space consumption by O(n) bits.
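
The O(n) bound on the wasted bits follows from a standard series. As a sketch, treat the fraction of bins that waste i bits per key as exactly 1/2^i and assume keys are spread evenly over bins, so that about n/2^i keys waste i bits each:

```latex
\[
\sum_{i \ge 1} \frac{n\, i}{2^{i}}
  \;=\; n \sum_{i \ge 1} \frac{i}{2^{i}}
  \;=\; 2n .
\]
```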


Analyzing the total space consumption of the hash table. The use of quotients in place of the master 
hash function complicates several aspects of the analysis of Iceberg hashing. Before discussing these aspects, 
however, let us analyze the space consumption of the hash table, assuming the standard analysis of Iceberg 
hashing. Throughout the rest of the section we set h to be Θ(log² n/log log n), which maximizes the space
efficiency of the data structure. 

The number of bits used to store keys (in the front yard) of the table is

\[ nR + O(n) = \log \binom{|U|}{n} + O(n \log\log n) \tag{8} \]

bits. As shown in Theorem 5, the space consumed by the backyard table is O(n/log n) bits, the space
consumed by metadata is O(n log log n) bits, and the empty slots in front-yard bins induce at most a
1 + O(log log n/log n) multiplicative overhead on the space needed to store the keys (that is, on (8)). Putting
the pieces together, the total space consumption is

\[ \log \binom{|U|}{n} + O(n \log\log n) \]

bits.


Handling lack of independence in the master hash function. We now turn our attention to a subtle
complication that arises in analyzing Iceberg hash tables that use quotienting. Because the keys are assumed to be
random distinct elements from a universe U, the master hashes (and thus the bin assignments) are not independent.
In particular, the distinctness assumption introduces (negative) correlation between the bin assignments
of keys. Since the bin assignments are no longer independent, we can no longer directly apply the Iceberg Lemma.

Let K ⊆ U be the set of keys that are ever placed into the hash table. In general, K could contain all
of U. At the cost of making each key O(log log n) bits longer, we can assume without loss of generality that
|K| ≤ |U|/polylog n for a polylogarithmic factor of our choice. Call this the sparsity property.


Let y_1, ..., y_{|K|} be independently selected random elements of U. For the sake of analysis, we can
treat the elements x_1, ..., x_{|K|} of K as being constructed via the following process: for i = 1, ..., |K|, if
y_i ∉ {x_1, ..., x_{i−1}}, then set x_i = y_i, and otherwise select x_i at random from U \ {x_1, ..., x_{i−1}}. Say that the
key x_i is dangerous if x_i ≠ y_i.

When a key x is inserted, say that x is vicariously dangerous if either x is dangerous or there is another
key y that is dangerous, is present, and maps to the same bin as does x. In order to analyze the backyard of Iceberg hashing
in the context of random keys (whose quotients are used as master hashes), it suffices to show that, at any
given moment, the number of vicariously dangerous keys is n/polylog n. All other keys can be analyzed as
though the master hashes were determined by y_1, ..., y_{|K|} (which are independent).
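
The coupling between the distinct keys and the independent samples can be simulated directly. In the sketch below, the function name `couple_keys` is hypothetical, the universe is the integer range [0, universe_size), and rejection sampling is used to draw a uniform element outside the keys chosen so far (which requires num_keys to be at most universe_size).

```python
import random

def couple_keys(universe_size, num_keys, rng):
    """Build distinct keys from independent samples and flag the dangerous ones."""
    xs, seen, dangerous = [], set(), []
    for _ in range(num_keys):
        y = rng.randrange(universe_size)      # independent, uniform sample
        x = y
        while x in seen:                      # the sample hit an earlier key
            x = rng.randrange(universe_size)  # resample uniformly outside `seen`
        xs.append(x)
        seen.add(x)
        dangerous.append(x != y)              # dangerous iff the key differs from its sample
    return xs, dangerous
```

When the number of keys is much smaller than the universe, only a small fraction of the keys come out dangerous, which is what the sparsity property is used for.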


Lemma 15. Consider a moment in which there are n keys in the table. Then w.s.h.p. in n, the number of 
dangerous keys present is n/ polylog n (for a polylogarithmic factor of our choice). 


Proof. The probability that x_i is dangerous is exactly (i − 1)/|U|, and the property of being dangerous is
independent between keys x_i. By the sparsity property, the probability (i − 1)/|U| is at most 1/polylog n for
all keys x_i. The lemma therefore follows by a Chernoff bound. □


Lemma 16. Consider a moment in which there are n keys in the table. Then w.s.h.p. in n, the number of
vicariously dangerous keys present is n/polylog n (for a polylogarithmic factor of our choice).


Proof. Let A = {a_1, ..., a_n} be the keys present in the table. Let Y_1, ..., Y_n be such that Y_i is the set of keys
present at the time of a_i's insertion. Let Z_i = Y_i \ A.

There are three ways that a_i can be vicariously dangerous:


• The first case is that a_i itself is dangerous. Lemma 15 tells us that w.s.h.p. in n there are at most
n/polylog n dangerous keys a_i.


• The second case is that there is an element y ∈ Z_i such that y is dangerous and a_i and y map to the same
bin. By Lemma 15 (w.s.h.p. in n), the number of dangerous keys in Z_i is n/polylog n. This, in turn,
means that at most a 1/polylog n fraction of bins contain a dangerous key from Z_i. The probability of a_i
mapping to the same bin as such a key (and not itself being dangerous) is at most 1/polylog n. Since the
Z_i's are disjoint (and ignoring the keys a_i that are dangerous), these probabilities are independent across
keys a_i. By a Chernoff bound, w.s.h.p. in n, the number of keys in this case (that are not dangerous) is
n/polylog n.


• The third case is that there is an element y ∈ A \ {a_i} such that y is dangerous and a_i and y map to the
same bin. By Lemma 15, the number of dangerous keys in A is at most n/polylog n w.s.h.p. Conditioning
on this, each key independently has at most a 1/polylog n probability of being in this third case (and
not being dangerous). By a Chernoff bound, w.s.h.p. in n, the number of keys in this case (that are not
dangerous) is n/polylog n.


Combining the cases completes the proof of the lemma. □


By Lemma 16, the fact that master hashes are determined by x_1, ..., x_{|K|} (which are not independent)
instead of y_1, ..., y_{|K|} (which are independent) only affects the size of the backyard of the hash table by
n/polylog n (because of vicariously dangerous records behaving differently in the two cases).


Simulating random keys with almost random permutations. In order to simulate random keys, a natural
approach is to apply a random permutation to the universe U. Constructing an efficiently describable random
permutation that can be evaluated in constant time remains a significant open question. Fortunately, there do
exist efficient k-wise δ-dependent permutations [35, 37], that is, permutations drawn from a distribution that
is δ-close to being k-wise independent. In more detail, there exists some constant α > 0 such that for k = n^α
and δ = 1/poly n, there is a k-wise δ-dependent family of permutations whose members can be evaluated in
constant time and described using n^β bits for some β < 1. In particular, one can achieve δ = 1/2^{O(log n)} using
Corollary 8.1 of (along with the hash family of [38] for f_1, f_2), and then, as shown by [27], δ can be amplified
to 1/poly n by composing together O(1) independently selected permutations that each satisfy δ = 1/2^{O(log n)}.

Because δ = 1/poly n, the fraction of the time that the hash family does not behave as k-wise independent
can be easily absorbed into the failure probability of Iceberg hashing (assuming we are only proving a w.h.p.
guarantee). On the other hand, n^α-wise independence does not obviously suffice for our analysis of Iceberg
hashing. Essentially the same problem was encountered previously in [4], and their solution also works here.
For completeness we describe the solution below.

Let N be a parameter. As in Appendix A (where we discuss how to construct explicit families of
hash functions for Iceberg hashing), we break our table into N^{1−ε} subtables for some ε sufficiently smaller
than α. We will guarantee that the subtables are all the same size as each other, up to negligible terms,
which means that they can be resized synchronously with each other; this, in turn, means that each partial
expansion/shrinkage can be implemented in O(1) memory allocations, which allows us to directly access
all of the subtables without any extra layers of indirection. That is, the act of decomposing the hash table
into N^{1−ε} subtables does not hurt the cache-efficiency of our data structure.

We may assume without loss of generality that the size of the table stays in the range [N/2, N] for some N, 
since every time the size of the table changes by a constant factor, we can rebuild the table (in a deamortized 
fashion) to accommodate the new value of N. Note that such a rebuild does not violate succinctness because, as 
we move elements from the old version of the table to the new version, the partial shrinkages that occur in the old 
subtables will keep them succinct until they get to small enough sizes that their space consumption is negligible. 

Keys x are mapped to a subtable by performing a permutation π_1(x) and then using the least significant
(1 − ε) log N bits as a subtable choice. Let x' denote the most significant log |U| − log N + ε log N bits of π_1(x).
Rather than storing x in the subtable, it suffices to store x'. And rather than storing x' in the subtable, we
instead perform a second permutation π_2(x') to obtain the actual key that we store in the subtable.

The second permutation π_2 can be implemented as an N^α-wise (1/poly n)-dependent permutation. As in
Section A, the small size of the subtable ensures that N^α-wise independence suffices. The more difficult challenge
is implementing π_1 so that, with high probability, each of the subtables receives N^ε + O(N^{(2/3)ε}) keys.

Arbitman et al. [4] give an elegant solution to this problem by defining π_1 using a single-round Feistel permutation.
Define the right part x_R of a key x to be the least significant (1 − ε) log N bits of x and define the left part
x_L to be the remaining bits. Let H be the family of hash functions given by Pagh and Pagh, parameterized to
simulate k-independence for k = N/log² N and so that each h ∈ H maps the left part x_L of a key to an output of
(1 − ε) log N bits (which is the same number of bits as in the right part x_R of the key). The guarantee given by
Pagh and Pagh is that for a random h ∈ H, and for any given set S of size O(N/log² N), the function h acts fully randomly
on S with high probability in N; moreover, each hash function h ∈ H can be represented with O(N/log N)
description bits and can be evaluated in constant time. Using a random h ∈ H, the permutation π_1(x) is defined
by h(x_L) ⊕ x, where ⊕ denotes the XOR operator. Note that π_1 changes only the least significant (1 − ε) log N
bits of x, meaning that x_L does not change. Thus, even though the function h may not be invertible, the function
π_1 is invertible (and, in fact, π_1 = π_1^{−1}). This ensures that π_1 is a permutation. On the other hand, as shown
by Arbitman et al. [4] (see their Claim 5.7), the randomness from h is sufficient to ensure that π_1 distributes
keys evenly among the subtables, that is, every subtable has N^ε + O(N^{(2/3)ε}) keys with high probability in N.

We remark that, since π_1 preserves x_L, the input to π_2 is actually just x_L. Thus, the subtable is selected
by h(x_L) ⊕ x_R and then the key π_2(x_L) is stored in the subtable.
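
The single-round Feistel construction is easy to check on small examples. The sketch below is illustrative only: the function name `make_pi1` is hypothetical, BLAKE2b from Python's standard library is used as a stand-in for the hash family (it is not the Pagh and Pagh construction and carries none of its guarantees), and keys are assumed to be non-negative integers whose left part fits in 128 bits.

```python
import hashlib

def make_pi1(low_bits, seed=b"demo"):
    """Return a single-round Feistel map that XORs h(left part) into the low bits."""
    mask = (1 << low_bits) - 1

    def h(left):
        # Stand-in hash (an assumption), truncated to low_bits bits.
        data = seed + left.to_bytes(16, "big")
        return int.from_bytes(hashlib.blake2b(data).digest(), "big") & mask

    def pi1(x):
        left = x >> low_bits      # the left part is never modified
        return x ^ h(left)        # only the low_bits low-order bits change

    return pi1
```

Applying the map twice returns the original key, because the left part (and hence the hash value that is XORed in) is the same both times. The subtable choice would then be the low-order bits of the result.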

We also remark that, although the permutations π_1 and π_2 are used to randomize the key (and thus determine
the master hash), the chunk hash functions {g_s} used by waterfall addressing must be generated through
a separate process, and should thus be implemented using the hash-function construction given in Appendix A.


Putting the pieces together. To conclude the section, we give a theorem summarizing the guarantees of a 
quotiented Iceberg hash table. 


Theorem 10 (Theorem 5 with Quotienting). Consider a dynamic quotiented Iceberg hash table. Let n be the
current number of keys, and suppose |U| ≤ poly n. Then the table consumes

\[ \log \binom{|U|}{n} + O(n \log\log n) \]

bits and supports operations that run in constant time with high probability in n. Additionally, the stability
and cache-efficiency guarantees that hold on the non-quotiented Iceberg hash table continue to hold for the
quotiented Iceberg hash table (although, of course, due to quotienting, some bits of each key may be stored
implicitly based on where the key resides).


We remark that the quotiented Iceberg hash table can also easily be adapted to store an O(log n)-bit value
for each key. If values are j bits, then the table uses space

\[ \log \binom{|U|}{n} + nj + O(n \log\log n) \]

bits.


8 Other Related Work on Hash Tables 


In this section we summarize some of the milestones in past work on hash-table design. Although many of these 
works are also discussed earlier in the paper, we include a discussion of them all together here for completeness. 

The first hash table to achieve constant-time operations with high probability was that of Dietzfelbinger 
et al. [14] in 1990 (building on previous work by Fredman et al. [19] and Dietzfelbinger et al. [13]). Subsequently,
Pagh and Rodler [40] introduced a much simpler hash table, namely Cuckoo hashing, that achieves
constant-time queries but allows for insertions and deletions to sometimes take longer. By queuing the work 
to be performed in a Cuckoo hash table, and performing it incrementally, Arbitman et al. [3] showed how to 
make all operations in a Cuckoo hash table take constant time. 

A separate line of work has focused on optimizing space utilization. The first dynamically-resizable, succinct 
(i.e., the space consumption is comparable with the information-theoretic lower bound) hash table was proposed by
Raman and Rao [45] in 2003, but the insertion cost was only O(1) expected in the amortized sense. Demaine et
al. [12] improved this to constant time in the worst case in exchange for a constant factor loss in space consumption.
In 2010, Arbitman et al. [4] gave the first hash table to both be succinct and provide all worst-case costs (although
it is not dynamically resizable). They used a front yard/backyard table, in which the backyard is implemented
as a deamortized Cuckoo hash table, which naturally lends itself to a mechanism for controlling the occupancy
of the backyard by moving records back to the front yard. Similar ideas were used by Bercea and Even to build
hash tables for random multisets [7] and for multisets [8]. Recently, Liu et al. [34] presented a dictionary that, in
addition to succinctness and worst-case costs, supports dynamic resizing. As in this paper, the results of [4] and
are presented both in terms of hash tables with high load factors and in terms of succinct data structures.

Research on external memory hashing has taken two avenues. The first is to allow for super-constant 
time queries in exchange for sub-constant (amortized) time inserts and deletes [9, 25, 50]. The second is to 
achieve 1 + o(1) cache misses per operation, for queries, insertions, and deletes alike [26, 41]. Particularly 
interesting is the external memory hash table by Jensen and Pagh [26], which supports all operations in 
1 + o(1) expected amortized cache misses, uses (1 + o(1))n space, and is dynamically resizable, but does not 
achieve constant-time operations in the RAM model. Their hash table [26] makes use of cache-efficient resizing 
techniques that were previously developed by Larson [31] for external-memory file storage (as discussed in 
Section 4, the same resizing techniques serve as a starting point in our design of waterfall addressing), which 
in turn extend previous work on the topic by Litwin [33]. 

A major open question is whether randomness is needed to achieve constant-time operations (see discussion 
in [4] as well as [24, 39, 42, 46, 49]). In the case where the hash table is very small, the dynamic fusion tree 
of Pătraşcu and Thorup achieves this goal, but for larger hash tables, the question remains open. This 
raises the simpler question of what the smallest-achievable failure probability is. Until this paper, the only 
known schemes to achieve subpolynomial probabilities were those of [21, 22], resulting in a failure probability 
of $1/2^{\operatorname{polylog} n}$. Whether these schemes are compatible with explicit families of hash functions (without 
amplifying the failure probability) remains an open question. 


A An Explicit Family of Hash Functions for Iceberg Hashing 


In this section, we show how to implement Iceberg hashing using $O(n^{\alpha} \log n)$ random bits for a positive 
constant $\alpha > 0$ of our choice. As in past work [4, 22, 34], the hash-function families that we use will introduce 
an additional 1/poly(n) probability of failure, meaning that they cannot be used to offer anything better than 
w.h.p. guarantees. 


Reducing to the case where keys are O(log n) bits. In general, Iceberg hashing allows for keys as large as 
O(w) bits, where w is the machine word size. We can assume without loss of generality, however, that keys 
are O(log n) bits. In particular, prior to computing the hash of a key x, we can use pairwise-independent 
hashing to map x to an intermediate value x′ that is O(log n) bits, and then we can compute the hash of x′ 
rather than x. The intermediate values introduce a 1/poly(n) probability of collision between pairs of keys, 
but this is easily absorbed into the failure probability of the hash table. 
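
As an illustration of this reduction, the sketch below compresses word-sized keys to O(log n) bits before any further hashing. It assumes the classic family h(x) = ((ax + b) mod p) mod m, which is approximately pairwise independent; the prime p = 2^61 − 1, the constant factor 3 in the output length, and the names `make_pairwise_hash` and `compress` are illustrative choices, not taken from the construction above.

```python
import random

MERSENNE_PRIME = (1 << 61) - 1  # assumed modulus; keys must be smaller than this

def make_pairwise_hash(out_bits: int, rng: random.Random):
    """Draw h(x) = ((a*x + b) mod p) mod 2**out_bits with random a, b."""
    a = rng.randrange(1, MERSENNE_PRIME)
    b = rng.randrange(0, MERSENNE_PRIME)

    def h(x: int) -> int:
        return ((a * x + b) % MERSENNE_PRIME) % (1 << out_bits)

    return h

rng = random.Random(1)
n = 1000
out_bits = 3 * n.bit_length()          # O(log n) bits; the factor 3 is an assumption
compress = make_pairwise_hash(out_bits, rng)

keys = rng.sample(range(1 << 60), n)   # n distinct 60-bit keys
intermediate = [compress(x) for x in keys]
collisions = n - len(set(intermediate))
print(f"{out_bits}-bit intermediate values, {collisions} colliding keys")
```

A larger constant factor in `out_bits` lowers the collision probability polynomially in n, which is the trade-off the paragraph above relies on.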


Two families of hash functions. We will make use of two families of hash functions, both of which map a 
universe U of size polynomial in n to O(log n) bits. 

The first family $H_1$, which is due to Pagh and Pagh (see also related work by Dietzfelbinger and 
Woelfel [16]), offers the following guarantee for a randomly selected hash function $g \in H_1$: for any fixed set 
$S \subseteq U$ of size $|S| = n^{\alpha}$, with high probability in n, g is random on S. Moreover, each hash function $g \in H_1$ 
can be represented with $O(n^{\alpha} \log n)$ description bits and can be evaluated in constant time. 

The second family $H_2$ uses tabulation hashing [44]. We will use $H_2$ to map records to random “buckets” 
in the range $[n^{1-\varepsilon}]$ for some small constant $\varepsilon$ to be selected later. By using tabulation hashing with an 
appropriately small table-size parameter c, we can arrive at the following guarantee for a randomly selected 
hash function $g \in H_2$: for any set S of O(n) records, with high probability in n, the number of records from S 
that map to any given bucket is $|S|/n^{1-\varepsilon} \pm n^{(2/3)\varepsilon}$ (see Theorem 1 of [44]). Moreover, each function $g \in H_2$ 
can be represented with $O(n^{\varepsilon})$ description bits and can be evaluated in constant time. 
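
For readers unfamiliar with tabulation hashing, the following is a minimal sketch of simple tabulation: the key is split into c characters, and one random table entry per character is XORed together. The parameter values, the restriction to a power-of-two number of buckets, and the name `make_tabulation_hash` are assumptions made for illustration; the sketch does not reproduce the specific parameter choices needed for the guarantee above.

```python
import random

def make_tabulation_hash(key_bits: int, chars: int, out_bits: int, rng: random.Random):
    """Simple tabulation hashing over `chars` characters of key_bits/chars bits each."""
    if key_bits % chars != 0:
        raise ValueError("key_bits must be divisible by chars")
    char_bits = key_bits // chars
    mask = (1 << char_bits) - 1
    # One table of 2**char_bits random out_bits-bit entries per character position.
    tables = [[rng.getrandbits(out_bits) for _ in range(1 << char_bits)]
              for _ in range(chars)]

    def h(x: int) -> int:
        out = 0
        for table in tables:
            out ^= table[x & mask]   # look up the current character
            x >>= char_bits          # move on to the next character
        return out

    return h

rng = random.Random(7)
f = make_tabulation_hash(key_bits=32, chars=4, out_bits=6, rng=rng)  # 64 buckets

keys = rng.sample(range(1 << 32), 64 * 200)   # 200 keys per bucket on average
loads = [0] * 64
for x in keys:
    loads[f(x)] += 1
print("min load:", min(loads), "max load:", max(loads))
```

The description size is the total table size, here 4 tables of 256 entries, which is why a small table-size parameter keeps the number of description bits low.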


Using $H_1$ and $H_2$ in Iceberg hashing. Let $\alpha > 0$ be a small positive constant of our choice and let $\varepsilon > 0$ be 
a sufficiently small positive constant relative to $\alpha$. Let N be a parameter and consider a hash table whose size 
stays in the range $[N^{1-\varepsilon/4}, N]$. We remark that this size restriction is without loss of generality using the 
window rebuild technique described in Remark[P]in Section [4] 

We maintain $k = N^{1-\varepsilon}$ Iceberg hash tables $T_1, \ldots, T_k$, each of which is managed using a single hash 
function g drawn at random from $H_1$.¹⁸ Keys are then mapped to a random table $T_i$, where i is selected 
using a random hash function f from $H_2$. 

The guarantee of $H_2$ ensures that all of the tables $T_1, \ldots, T_k$ have the same numbers of records assigned to 
them up to $\pm N^{(2/3)\varepsilon}$, which is a low-order term for each table. As a consequence, we can dynamically resize 
all of the tables $T_1, \ldots, T_k$ in sync with one another. That is, when we perform a partial expansion or 
shrinkage on one of the tables, we perform it on all of them. This is important, as it eliminates the need to 
have k pointers pointing to different data structures, and allows us to store pointers to all of our memory 
allocations in cache, as in Theorem] 

The guarantee of $H_1$, on the other hand, allows us to treat each of the tables $T_1, \ldots, T_k$ as being managed 
by $N^{\alpha}$-independent hash functions. Since each table $T_i$ stores at most $N^{\varepsilon}$ keys at any given moment (with 
high probability), and since the analysis of Iceberg hashing on $N^{\varepsilon}$ keys can be performed with $O(N^{2\varepsilon})$-wise 
independence (note, in particular, that the proof of the Iceberg lemma on m balls requires only $O(m^2)$-wise 
independence so that the random variables $\alpha = \{\alpha_i\}$ and $\beta = \{\beta_i\}$ are mutually independent), it follows 
that $N^{\alpha}$-independence suffices for the analysis of each individual table $T_i$. 

Call the resulting data structure a low-randomness Iceberg hash table. We have the following 
theorem. 


Theorem 11. Consider a low-randomness Iceberg hash table whose size stays in the range $[N^{1-\varepsilon/4}, N]$. 
Furthermore, suppose that the description bits for g and f fit in cache. Then the guarantees from Theorems|[2] 


aa andl hold. 


Remark 3. As was the case in Theorem|[5] (see Remark[Q), the size restriction on the hash table can be 
removed by performing random rebuilds very rarely. As in Remark[Q] this preserves the other guarantees of the 
hash table. 


B The Full Analysis of Partial Resizing 


In this section we give the full analysis of partial expansions in an Iceberg hash table (as described in Section 
4.3). 

To simplify the exposition, we shall perform our analysis as though we were using waterfall addressing 
(rather than truncated waterfall addressing). The relevant difference is that truncated waterfall addressing is 
not quite uniform, selecting some bins with a (1 + O(1/s))-factor greater likelihood than others. This factor 
is easily absorbed into the Iceberg hash table by simply reducing the entire load of the table by a factor of 
1+ O(1/s) (or by increasing h by a factor of 1 + O(1/s)). Rather than carry this factor of 1 + O(1/s) (on the 
load of the table) around with us through the analysis, we instead perform the analysis assuming uniform bin 
assignments, and then adjust the analysis at the end appropriately. 

The next lemma shows that the guarantee from Lemma 5 (i.e., the analysis of the backyard in static-size 
Iceberg hashing) continues to hold after a partial expansion. 


18 We can use the hash function g to generate all of the hash functions needed for an Iceberg hash table. To evaluate the i-th hash 
function on a key x, we simply compute $g(x \circ i)$, where $\circ$ denotes concatenation. 


Lemma 17. Let $h \le \operatorname{polylog} n$. Consider any time $t > t_2$ prior to the next partial expansion or contraction. 
Let $r_1$ be the number of bins prior to the partial expansion and $r_2$ be the number of bins after the partial 
expansion. Suppose that, during the partial expansion, the total number of records never exceeds $r_1 h$, and set 
$N = r_2 h$. 

With super high probability in N, at time t, the number of records in the backyard is at most N/poly(h). 
Moreover, for a given record x, the probability that x hashes to a bin $g_{\mathrm{new}}(x)$ with a non-zero floating counter 
at time t is at most 1/poly(h). 


Proof. Consider the following two situations. 


• Situation (1): Suppose we create an Iceberg table T consisting of $r_1$ bins with capacities $h + \tau_h$. We 
then insert into T the records present at time $t_0$. Finally, we duplicate on table T everything that 
happens between time $t_0$ and t. 


• Situation (2): Suppose we create an Iceberg table T consisting of $r_2$ bins with capacities $h + \tau_h$. We 
then insert into T the records present at time $t_0$. Finally, we duplicate on table T everything that 
happens between time $t_0$ and t. 


For $i \in \{1, 2\}$ and for any time j, let $X_i(j)$ denote the number of capacity exposers at time j in Situation 
(i). Let $Y_i(j)$ denote the set of bins b at time j in Situation (i) such that there is at least one capacity 
exposer x satisfying $g_{\mathrm{new}}(x) = b$. By LemmalJ] we have that $X_i(t_1) \le N/\operatorname{poly}(h)$ and $X_i(t) \le N/\operatorname{poly}(h)$ 
with super high probability in N for both $i \in \{1, 2\}$; and that for a given record x, the probability that 
$g_{\mathrm{old}}(x) \in Y_1(t_1)$, that $g_{\mathrm{old}}(x) \in Y_1(t)$, that $g_{\mathrm{new}}(x) \in Y_2(t_1)$, or that $g_{\mathrm{new}}(x) \in Y_2(t)$ is at most 1/poly(h). We 
shall use these bounds as the main tools for proving our lemma. 

The process for performing partial expansions is designed so that at time t, since t > to, there are only 
four ways that a record x can be in the backyard: 


1. The record x is grandfathered, and at the time freeze $t_1$, the demand counter d for the bin $g_{\mathrm{new}}(x)$ was 
larger than $g_{\mathrm{new}}(x)$’s capacity $h + \tau_h$. 


2. The record x is grandfathered, and at the time freeze $t_1$, there was another record y such that 
$g_{\mathrm{new}}(x) = g_{\mathrm{new}}(y)$ and both x and y had the same fingerprints. 


3. The record x was inserted after time $t_1$, and when x was inserted, there was another record y such that 
$g_{\mathrm{new}}(x) = g_{\mathrm{new}}(y)$ and both x and y had the same fingerprints. 


4. The record x was inserted after time $t_1$, and when x was inserted, there were no free unreserved slots in 
bin $g_{\mathrm{new}}(x)$. 


We begin by considering the records that fall into Case (1) and for which $g_{\mathrm{new}}(x)$ is not in the new chunk 
C (call this Case (1a)). The basic idea in this case will be to compare our situation to that of Situation (1) at 
time $t_1$. Let x denote a record in Case (1a), and let Q denote the set of records in Case (1a) that reside in the 
backyard at time t. Since bin $b = g_{\mathrm{new}}(x)$ is not in C, the demand counter d for bin b is equal to the number of 
records y at time $t_1$ (i.e., at the time freeze) such that $g_{\mathrm{old}}(y) = b$. Moreover, the bin b contributes only 
$d - (h + \tau_h)$ of those records to Q. Thus, if X is the set of records at time $t_1$, then 

$$|Q| = \sum_{b=1}^{r_1} \max\bigl(0,\; |\{x \in X \mid g_{\mathrm{old}}(x) = b\}| - (h + \tau_h)\bigr).$$

But this expression is a lower bound on $X_1(t_1)$, which we know is at most N/poly(h) with super high 
probability in N. Moreover, if we define P to be the set of bins b for which $|\{x \in X \mid g_{\mathrm{old}}(x) = b\}| > h + \tau_h$, 
then we know that $P \subseteq Y_1(t_1)$, which means that the probability of a given record x satisfying $g_{\mathrm{old}}(x) \in P$ is 
at most 1/poly(h). This completes the analysis of Case (1a). 
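
The sum bounding |Q| and the set P of overfull bins can be computed directly from the bin assignments at the time freeze. The sketch below does so for a small hypothetical example; the function name `overflow_per_bin` and the sample loads are assumptions for illustration, with `capacity` standing in for $h + \tau_h$.

```python
from collections import Counter

def overflow_per_bin(bin_of_record, capacity):
    """Map each overfull bin to demand - capacity, i.e. its nonzero term in the sum."""
    demand = Counter(bin_of_record)          # demand counter d for each bin
    return {b: d - capacity for b, d in demand.items() if d > capacity}

# Hypothetical time freeze: 13 records hashed to 3 bins, each of capacity 4.
assignments = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2]
over = overflow_per_bin(assignments, capacity=4)

q_size = sum(over.values())   # |Q| = (6 - 4) + (5 - 4)
overfull_bins = set(over)     # the set P of bins whose demand exceeds capacity
print(q_size, sorted(overfull_bins))
```

Bins whose demand is at most the capacity contribute nothing, which is the role of the max(0, ·) in the displayed sum.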

Next we consider the records that fall into Case (1) and for which $g_{\mathrm{new}}(x)$ is in the new chunk C (call this 
Case (1b)). The analysis in this case is very similar to that of Case (1a), except that we now compare to 
Situation (2) at time $t_1$. Let x denote a record in Case (1b), and let Q denote the set of records in Case (1b) 
that reside in the backyard at time t. Since bin $b = g_{\mathrm{new}}(x)$ is in C, the demand counter d for bin b is equal to 
the number of records y at time $t_1$ (i.e., at the time freeze) such that $g_{\mathrm{new}}(y) = b$. Moreover, the bin b 
contributes only $d - (h + \tau_h)$ of those records to Q. Thus, if X is the set of records at time $t_1$, then 

$$|Q| = \sum_{b=1}^{r_2} \max\bigl(0,\; |\{x \in X \mid g_{\mathrm{new}}(x) = b\}| - (h + \tau_h)\bigr).$$

But this expression is also a lower bound on $X_2(t_1)$, which we know is at most N/poly(h) with super high 
probability in N. Moreover, if we define P to be the set of bins b for which $|\{x \in X \mid g_{\mathrm{new}}(x) = b\}| > h + \tau_h$, 
then we know that $P \subseteq Y_2(t_1)$, which means that the probability of a given record x satisfying $g_{\mathrm{new}}(x) \in P$ is 
at most 1/poly(h). This completes the analysis of Case (1b). 


19 In both situations, time is measured from the frame of reference of the actual table that the lemma is about. That is, in each 
situation, once we have inserted the elements present at time $t_0$, we consider that point in time to be $t_0$. 

The analysis of Cases (2) and (3) follows directly from LemmaB] In particular, the number of records x at 
time t that Cases (2) and (3) contribute to the backyard is at most N/poly(h) with super high probability, 
and the probability of a record y hashing to a bin $g_{\mathrm{new}}(y)$ containing such a record x is at most 1/poly(h). 

We break Case (4) into two subcases just as we did for Case (1). Consider the records x in Case (4) 
such that $g_{\mathrm{new}}(x) \notin C$ (call this Case (4a)). The number of such records x is at most $X_1(t)$, which we know is 
at most N/poly(h) with super high probability. Moreover, the set Y of bins containing such records x satisfies 
$Y \subseteq Y_1(t)$, meaning that the probability of a record y hashing to a bin $g_{\mathrm{old}}(y) \in Y$ is at most 1/poly(h). 

Finally, consider the records x in Case (4) such that $g_{\mathrm{new}}(x) \in C$ (call this Case (4b)). The number of 
such records x is at most $X_2(t)$, which we know is at most N/poly(h) with super high probability. Moreover, 
the set Y of bins containing such records x satisfies $Y \subseteq Y_2(t)$, meaning that the probability of a record y 
hashing to a bin $g_{\mathrm{new}}(y) \in Y$ is at most 1/poly(h). □ 


We can extend the preceding lemma to consider times $t \in [t_1, t_2]$. 


Lemma 18. Consider any time $t \in [t_1, t_2]$ prior to the next partial expansion or contraction. Let $r_1$ be the 
number of bins prior to the partial expansion and $r_2$ be the number of bins after the partial expansion. Suppose 
that, during the partial expansion, the total number of records never exceeds $r_1 h$, and set $N = r_2 h$. Finally, let 
k be the number of records x in the backyard at time $t_1$ and let p be the probability that a record x hashes to a 
bin $g_{\mathrm{old}}(x)$ with a non-zero floating counter at time $t_1$. 

With super high probability in N, at time t, the number of records in the backyard is at most 
k + N/poly(h). Moreover, for a given record x, the probability that x hashes to a bin $g_{\mathrm{new}}(x)$ with a non-zero 
floating counter at time t is at most p + 1/poly(h). 


Proof. This follows by the same analysis as Lemma 17, except that we also consider a fifth way that records 
can reside in the backyard, which is that they resided in the backyard at time $t_1$. □ 


Performing partial contractions and hysteresis. Partial contractions can be implemented using the 
same approach as is described above for partial expansions. We perform a time freeze $t_1$, at which point we 
reserve space in each bin for the records that are currently present and that wish to reside in that bin 
(reserving up to $h + \tau_h$ slots in each bin). We then perform the Reshuffling Phase in the same way as for 
partial expansions. 

Partial expansions and contractions are performed via hysteresis. Consider a chunk C that is the j-th 
chunk in a doubling from $2^a$ bins to $2^{a+1}$ bins. Let $E = 2^a/s$ denote the number of bins in C. We must 
perform partial expansions and contractions so that, whenever the number of records is $h \cdot (2^a + (j-1)E)$ or 
larger, the chunk C is included in the table, and whenever the number of records is $h \cdot (2^a + (j-2)E)$ or 
smaller, the chunk C is not included in the table. To achieve this, whenever the number of records reaches 
$h \cdot (2^a + (j-2)E + (2/3)E)$, if C is not yet present, then we perform a partial expansion during the next 
hE/3 operations. Likewise, whenever the number of records reaches $h \cdot (2^a + (j-2)E + (1/3)E)$, if C is present, 
then we perform a partial contraction during the next hE/3 operations. These thresholds ensure that partial 
expansions and partial contractions do not overlap temporally. 
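
The arithmetic behind these thresholds can be checked mechanically. The sketch below computes the four record-count thresholds for a chunk and verifies that a resize started at its trigger finishes within its budget of hE/3 operations; it assumes that each operation changes the number of records by at most one, and the function name `chunk_thresholds`, the dictionary keys, and the sample parameters are illustrative.

```python
from fractions import Fraction

def chunk_thresholds(h: int, a: int, s: int, j: int) -> dict:
    """Record-count thresholds for the j-th chunk in a doubling from 2**a to 2**(a+1) bins."""
    E = Fraction(2**a, s)               # number of bins in the chunk
    base = 2**a + (j - 2) * E
    return {
        "absent_at_or_below": h * base,               # C must not be in the table
        "start_contraction": h * (base + E / 3),
        "start_expansion": h * (base + 2 * E / 3),
        "present_at_or_above": h * (base + E),        # equals h * (2**a + (j - 1) * E)
        "resize_budget": h * E / 3,                   # operations allotted to a resize
    }

th = chunk_thresholds(h=64, a=10, s=8, j=3)

# An expansion triggered at its threshold completes before C becomes mandatory.
assert th["start_expansion"] + th["resize_budget"] == th["present_at_or_above"]
# A contraction triggered at its threshold completes before C becomes forbidden.
assert th["start_contraction"] - th["resize_budget"] == th["absent_at_or_below"]
# The two triggers are one full budget apart, so the two kinds of resize cannot overlap.
assert th["start_expansion"] - th["start_contraction"] == th["resize_budget"]
```

Exact rational arithmetic is used so that the thirds of E do not introduce rounding error into the equalities.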

By combining the analyses of partial expansions and partial contractions, we arrive at a (super) high 
probability guarantee in terms of the table’s current size n. 


Lemma 19. Consider a dynamic Iceberg table maintained with $s = \sqrt{h}$, and suppose the size n stays in the 
range such that $h \le \operatorname{polylog} n$. 

Consider a time t, and let n be the current size of the table. Then w.s.h.p. in n, there are at most 
n/poly(h) records in the backyard. Moreover, for a given key x, the probability that bin g(x) has a non-zero 
floating counter is at most 1/poly(h). 


Proof. This follows directly from the analysis of the probabilistic guarantees during and after each partial 
resize. □ 


Performing cache efficient resizing. Next we consider the question of how to implement partial 
expansions and partial contractions efficiently in the EM model. Suppose $s = \sqrt{h}$, and suppose that the size 
of a cache line is $B = \Omega(h)$. Finally, suppose that we have a cache of size at least $M = c h^{1.5} B + s \log n$ for 
some sufficiently large constant c. (Note that the $s \log n$ space is simply for storing pointers to the chunks of 
memory that have been allocated during each partial expansion in the table’s history.) 

We begin by describing how to efficiently implement the Reshuffling Phase of a partial expansion. 
Traversing the backyard and attempting to move grandfathered records back down to the front yard requires 
only O(n/ poly(h)) cache misses (w.s.h.p.), where n is the current table size. Reshuffling records within the 
front yard is slightly more subtle, however, since we wish to move roughly O(n/s) records with far fewer 
than n/s cache misses. 

Say that a record x is promoted k-levels during a partial expansion if, due to the partial expansion, the 
number of bits in x’s bin position that are determined by m(x) increases by k. 

Partition the bins into reshuffling groups, where the reshuffling group of each bin is determined by the 
bin number modulo E. There are $O(hs) \le O(h^{1.5})$ elements in each reshuffling group. Importantly, any 
record that is promoted fewer than log h levels has the property that, when it is promoted, its reshuffling 
group doesn’t change (i.e., it is moved between two bins in the same reshuffling group). We perform the 
partial expansion group by group, loading a given reshuffling group into cache, and then performing the 
reshufflings for that group. Once a group is loaded into cache, moving records around within the group is free 
(in terms of cache misses). Promoting records log h or more levels is not free, and each such promotion may 
incur up to O(1) cache misses. 
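
The size of a reshuffling group follows from simple counting, which the sketch below makes concrete. It assumes that bins are numbered contiguously from 0, that s divides $2^a$, and that j chunks of E bins each have been added so far; the function name `reshuffling_groups` and the sample parameters are illustrative. It only demonstrates the partition by bin number modulo E, not the promotion rule itself.

```python
def reshuffling_groups(a: int, s: int, j: int):
    """Group bins 0 .. 2**a + j*E - 1 by bin number modulo E, where E = 2**a / s."""
    if (2**a) % s != 0:
        raise ValueError("this sketch assumes s divides 2**a")
    E = (2**a) // s
    num_bins = 2**a + j * E
    groups = [[] for _ in range(E)]
    for b in range(num_bins):
        groups[b % E].append(b)
    return groups

groups = reshuffling_groups(a=10, s=8, j=3)
sizes = {len(g) for g in groups}
print(len(groups), "groups, bins per group:", sizes)
```

Since $2^a = sE$, every group contains exactly s + j bins, which is between s and 2s; at roughly h records per bin this gives the O(hs) elements per group stated above.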

By analyzing the above scheme, we can bound the number of cache misses needed to perform the 
Reshuffling Phase. 


Lemma 20. Suppose $s = \sqrt{h}$, and suppose that the size of a cache line is $B = \Omega(h)$. Finally, suppose that we 
have a cache of size at least $M = c h^{1.5} B + \sqrt{h} \log n$ for some sufficiently large constant c. Let n, satisfying 
$h \le \operatorname{polylog} n$, be the current table size, and suppose we perform a partial expansion. Then the Reshuffling 
Phase can be implemented with $O(n/\sqrt{h})$ cache misses, w.s.h.p. in n. 


Proof. The number of cache misses spent on grandfathered records in the backyard is O(n/poly(h)) w.s.h.p. 
in n. The number of cache misses spent rearranging records within the reshuffling groups is O(n/h) in total, 
since each reshuffling group is loaded into and out of cache once. Finally, since each record x has 
probability at most 1/h of being promoted log h or more levels (all at once) during the partial expansion,²⁰ we 
have by a Chernoff bound that the number of records x that are promoted log h or more levels is O(n/h) 
w.s.h.p. in n. □ 
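
The last step of the proof can be illustrated with a small simulation. The sketch assumes that the relevant log h bits of each record's promotion sequence are independent and uniformly random, and that h is a power of two; under that assumption the event "one specific pattern of log h bits" has probability exactly 1/h, and the count over n records concentrates around n/h. The function name `count_long_promotions` and the parameters are illustrative.

```python
import random

def count_long_promotions(n: int, h: int, rng: random.Random) -> int:
    """Count records whose log2(h) sampled bits match one fixed pattern."""
    levels = h.bit_length() - 1        # log2(h), assuming h is a power of two
    # The pattern "most recent bit is 1, the previous levels - 1 bits are 0"
    # is a single value of a uniform `levels`-bit string, encoded here as 1.
    return sum(1 for _ in range(n) if rng.getrandbits(levels) == 1)

rng = random.Random(0)
n, h = 100_000, 64
count = count_long_promotions(n, h, rng)
print(count, "records promoted log h or more levels; expectation is", n / h)
```

The simulation is only a sanity check of the O(n/h) order of magnitude; the proof itself relies on the Chernoff bound rather than on any experiment.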


So far we have described how to perform the Reshuffling Phase efficiently. Notice, however, that the 
Preprocessing Phase can be implemented with the same grouping approach, and that the approach also 
works for the phases of partial contractions. Thus we have the following lemma: 


Lemma 21. Suppose $s = \sqrt{h}$, and suppose that the size of a cache line is $B = \Omega(h)$. Finally, suppose that we 
have a cache of size at least $M = c h^{1.5} B + \sqrt{h} \log n$ for some sufficiently large constant c. Let n, satisfying 
$h \le \operatorname{polylog} n$, be the current table size. Then a partial expansion or contraction can be implemented to incur 
at most O(n/h) cache misses w.s.h.p. in n. 


20 In particular, in the event that x is promoted by log h or more levels, we must have that $P_a(x) = 1$ and that 
$P_{a-1}(x), \ldots, P_{a-\log h+1}(x) = 0$, where P(x) is the promotion sequence and the current number of bins is in the range $(2^a, 2^{a+1}]$. 
This, in turn, happens with probability 1/h. 


Since each partial expansion and contraction incurs at most O(n/h) cache misses (w.s.h.p.), and is spread 
across $\Theta(hE) = \Theta(n/\sqrt{h})$ operations, we can randomize on which operations the cache misses occur, so that each 
operation incurs only $O(1/\sqrt{h})$ resizing cache misses in expectation. Thus we arrive at the following lemma: 


Lemma 22. Consider a dynamic Iceberg hash table whose size n stays in the range such that 
$h \le O(\log n/\log\log n)$, and suppose that we set $s = \sqrt{h}$. Suppose that the table used in the backyard supports 
constant-time operations (w.h.p. in n) and has load factor at least 1/poly(h). Finally, suppose that we have a 
cache of size at least $M = c h^{1.5} B + \sqrt{h} \log n$ for some sufficiently large constant c, and suppose that each bin 
is stored in a cache line of size O(B). 

The expected number of cache misses incurred by a given operation is $1 + O(1/\sqrt{h})$. 


Proof. We have already shown that the expected number of cache misses incurred by work spent on resizing 
is $O(1/\sqrt{h})$. Whenever a partial expansion or contraction is occurring, each record x has an 
$O(1/s) = O(1/\sqrt{h})$ chance of having $g_{\mathrm{old}}(x) \ne g_{\mathrm{new}}(x)$, in which case an operation on x may be forced to visit 
multiple bins. In the cases where $g_{\mathrm{old}}(x) = g_{\mathrm{new}}(x)$, we can analyze the cache misses just as in Theorem] 
except that we now use Lemma 19 in place of Lemma 5. □ 


We can now prove Theorem 5. 


Proof of Theorem 5. The claim of cache efficiency follows from Lemma 22. The claim of stability follows from 
the design of the data structure. 

To prove the claim of time efficiency, we must verify that each partial expansion/contraction can be 
completed in time O(n/s). By Lemma 19, the backyard has size $O(n/\operatorname{poly}(h)) \le O(n/s)$ with high 
probability, and thus we can ignore resizing time spent on records in the backyard. By Lemma 10, the time 
spent in the Preprocessing Phase and the Reshuffling Phase on records in the front yard is O(n/s) with high 
probability in n. The other time costs (not from resizing) can be analyzed just as in Theorem] 

Finally we prove space efficiency. By Lemma 19, the space consumed by the backyard is negligible. On the 
other hand, our scheme always maintains an average load of $(1 - O(1/s))h$ on the bins in the table.²¹ Since 
each bin takes space (1 + O(log h/h)), the claim of space efficiency follows. □ 


21 This differs from the static case, where we maintained an average load of h. The difference stems from (a) the fact that we 
perform resizing using hysteresis, allowing for the load to change by a factor of 1 + O(1/s) before performing resizing, and (b) the 
fact that we must decrease the load by a factor of 1 − O(1/s) in order to account for truncated waterfall addressing being slightly 
nonuniform. 